Abstract

The diameter k-clustering problem is the problem of partitioning a finite subset of $\mathbb{R}^d$ into k
subsets called clusters such that the maximum diameter of the clusters is minimized. One early
clustering algorithm that computes a hierarchy of approximate solutions to this problem (for all
values of k) is the agglomerative clustering algorithm with the complete linkage strategy. For decades,
this algorithm has been widely used by practitioners. However, it is not well studied theoretically.
In this paper, we analyze the agglomerative complete linkage clustering algorithm. Assuming that
the dimension d is a constant, we show that for any k the solution computed by this algorithm is an
$O(\log k)$-approximation to the diameter k-clustering problem. Our analysis holds not only for the
Euclidean distance but for any metric that is based on a norm. Furthermore, we analyze the closely
related k-center and discrete k-center problems. For the corresponding agglomerative algorithms, we
deduce an approximation factor of $O(\log k)$ as well.

Keywords: agglomerative clustering, hierarchical clustering, complete linkage, approximation guarantees
1 Introduction



Clustering is the process of partitioning a set of objects into subsets (called clusters) such that each subset
contains similar objects and objects in different subsets are dissimilar. There are many applications for
clustering, including data compression Q3], analysis of gene expression data [5], anomaly detection [11],
and structuring results of search engines [2]. For every application, a proper objective function is used
to measure the quality of a clustering. One particular objective function is the largest diameter of the
clusters. If the desired number of clusters k is given, we call the problem of minimizing this objective
function the diameter k-clustering problem.

One of the earliest and most widely used clustering strategies is agglomerative clustering. The history
of agglomerative clustering goes back at least to the 1950s (see for example [3[T2]). Later, biological
taxonomy became one of the driving forces of cluster analysis. In [15] the authors, who were the first
biologists using computers to classify organisms, discuss several agglomerative clustering methods.

Agglomerative clustering is a bottom-up clustering process. At the beginning, every input object 
forms its own cluster. In each subsequent step, the two 'closest' clusters will be merged until only one 
cluster remains. This clustering process creates a hierarchy of clusters, such that for any two different 
clusters A and B from possibly different levels of the hierarchy we either have $A \cap B = \emptyset$, $A \subset B$, or
$B \subset A$. Such a hierarchy is useful in many applications, for example, when one is interested in hereditary
properties of the clusters (as in some bioinformatics applications) or if the exact number of clusters is a 
priori unknown. 

In order to define the agglomerative strategy properly we have to specify a distance measure between 
clusters. Given a distance function between data objects, the following distance measures between 
clusters are frequently used. In the single linkage strategy, the distance between two clusters is defined 
as the distance between their closest pair of data objects. Using this strategy is equivalent to computing 
a minimum spanning tree of the graph induced by the distance function using Kruskal's algorithm. In
the case of the complete linkage strategy, the distance between two clusters is defined as the distance between
their farthest pair of data objects. In the average linkage strategy the distance is defined as the average 
distance between data objects from the two clusters. 
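The three cluster distances described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: clusters are lists of coordinate tuples, the underlying distance is Euclidean (`math.dist`), and the function names are hypothetical rather than taken from the paper.

```python
from itertools import product
from math import dist


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Average distance over all pairs of points, one from each cluster."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


# Two small clusters on a line (illustrative data only).
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(3.0, 0.0), (5.0, 0.0)]
print(single_linkage(A, B), complete_linkage(A, B), average_linkage(A, B))
```

For these two clusters the single linkage distance is realized by the points at 1 and 3, and the complete linkage distance by the points at 0 and 5.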

1.1 Related Work 

In this paper, we study the agglomerative clustering algorithm using the complete linkage strategy to
find a hierarchical clustering of n points from $\mathbb{R}^d$. The running time is obviously polynomial in the
description length of the input. Therefore, our only goal in this paper is to give an approximation
guarantee for the diameter k-clustering problem. The approximation guarantee is given by a factor $\alpha$
such that the cost of the k-clustering computed by the algorithm is at most $\alpha$ times the cost of an optimal
k-clustering. Although the agglomerative complete linkage clustering algorithm is widely used, there are
only few theoretical results considering the quality of the clustering computed by this algorithm. It
is known that there exists a certain metric distance function such that this algorithm computes a
k-clustering with an approximation factor of $\Omega(\log k)$. However, prior to the analysis we present in
this paper, no non-trivial upper bound for the approximation guarantee of the classical complete linkage
agglomerative clustering algorithm was known, and deriving such a bound has been discussed as one of
the open problems in [3].

The diameter k-clustering problem is closely related to the k-center problem. In this problem, we are
searching for k centers and the objective is to minimize the maximum distance of any input point to the 
nearest center. When the centers are restricted to come from the set of the input points, the problem is 
called the discrete k-center problem. It is known that for metric distance functions the costs of optimal 
solutions to all three problems are within a factor of 2 from each other. 

For the Euclidean case, we know that for fixed k, i.e. when we are not interested in a hierarchy
of clusterings, the diameter k-clustering problem and the k-center problem are NP-hard. In fact, it
is already NP-hard to approximate both problems with an approximation factor below 1.96 and 1.82,
respectively [5].

Furthermore, there exist provably good approximation algorithms in this case. For the discrete
k-center problem, a simple 2-approximation algorithm is known for metric spaces [5], which immediately
yields a 4-approximation algorithm for the diameter k-clustering problem. For the k-center problem, a
variety of results is known. For example, for the Euclidean metric in [1] a $(1 + \varepsilon)$-approximation algorithm
with running time $2^{O(k \log k / \varepsilon^2)} dn$ is shown. This implies a $(2 + \varepsilon)$-approximation algorithm with the same
running time for the diameter k-clustering problem.

Also, for metric spaces a hierarchical clustering strategy with an approximation guarantee of 8 for
the discrete k-center problem is known [1]. This implies an algorithm with an approximation guarantee
of 16 for the diameter k-clustering problem.

This paper as well as all of the above mentioned work is about static clustering, i.e. in the problem 
definition we are given the whole set of input points at once. An alternative model of the input data is to 
consider sequences of points that are given one after another. In [3], the authors discuss clustering in a 
so-called incremental clustering model. They give an algorithm with constant approximation factor that 
maintains a hierarchical clustering while new points are added to the input set. Furthermore, they show 
a lower bound of $\Omega(\log k)$ for the agglomerative complete linkage algorithm and the diameter k-clustering
problem. However, since their model differs from ours, their results have no bearing on our results. 

1.2 Our contribution 

In this paper, we study the agglomerative complete linkage clustering algorithm and related algorithms
for input sets $X \subset \mathbb{R}^d$. To measure the distance between data points, we use a metric that is based on a
norm, e.g., the Euclidean metric. We prove that the agglomerative solution to the diameter k-clustering
problem is an $O(\log k)$-approximation. Here, the O-notation hides a constant that is doubly exponential
in d. This approximation guarantee holds for every level of the hierarchy computed by the algorithm.
That is, we compare each computed k-clustering with an optimal solution for that particular value of
k. These optimal k-clusterings do not necessarily form a hierarchy. In fact, there are simple examples
where optimal solutions have no hierarchical structure.
Our analysis also yields that if we allow 2k instead of k clusters and compare the cost of the computed
2k-clustering to an optimal solution with k clusters, the approximation factor is independent of k and
depends only on d. Moreover, the techniques of our analysis can be applied to prove stronger results
for the k-center problem and the discrete k-center problem. For the k-center problem, we derive an
approximation guarantee that is logarithmic in k and only singly exponential in d. For the discrete
k-center problem, we derive an approximation guarantee that is logarithmic in k and the dependence on
d is only linear and additive.

Furthermore, we give almost matching upper and lower bounds for the one-dimensional case. These
bounds are independent of k. For $d \ge 2$ and the metric based on the $\ell_\infty$-norm, we provide a lower bound
that exceeds the upper bound for $d = 1$. For $d \ge 3$, we give a lower bound for the Euclidean case which
is larger than the lower bound for $d = 1$. Finally, we construct instances providing lower bounds for any
metric based on an $\ell_p$-norm with $1 \le p \le \infty$. However, the construction of these instances needs the
dimension to depend on k.



2 Preliminaries and problem definitions 

Throughout this paper, we consider input sets that are finite subsets of $\mathbb{R}^d$. Our results hold for arbitrary
metrics that are based on a norm, i.e., the distance $\|x - y\|$ between two points $x, y \in \mathbb{R}^d$ is measured
using an arbitrary norm $\|\cdot\|$. Readers who are not familiar with arbitrary metrics or are only interested
in the Euclidean case may assume that $\|\cdot\|_2$ is used, i.e. $\|x - y\| = \sqrt{\sum_{i=1}^{d} (x_i - y_i)^2}$. For $r \in \mathbb{R}$ and
$y \in \mathbb{R}^d$, we denote the closed d-dimensional ball of radius r centered at y by $B_r^d(y) := \{x \mid \|x - y\| \le r\}$.

Given $k \in \mathbb{N}$ and a finite set $X \subset \mathbb{R}^d$ with $k \le |X|$, we say that $\mathcal{C}_k = \{C_1, \ldots, C_k\}$ is a k-clustering
of X if the sets $C_1, \ldots, C_k$ (called clusters) form a partition of X into k non-empty subsets. We call a
collection of k-clusterings of the same finite set X but for different values of k hierarchical if it fulfills
the following two properties. First, for any $1 \le k \le |X|$ the collection contains at most one k-clustering.
Second, for any two of its clusterings $\mathcal{C}_i, \mathcal{C}_j$ with $|\mathcal{C}_i| = i < j = |\mathcal{C}_j|$, every cluster in $\mathcal{C}_i$ is the union of
one or more clusters from $\mathcal{C}_j$. A hierarchical collection of clusterings is called a hierarchical clustering.

We define the diameter of a finite and non-empty set $C \subset \mathbb{R}^d$ to be $\mathrm{diam}(C) := \max_{x,y \in C} \|x - y\|$.
Furthermore, we define the diameter cost of a k-clustering $\mathcal{C}_k$ as its largest diameter, i.e.
$\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_k) := \max_{C \in \mathcal{C}_k} \mathrm{diam}(C)$. The radius of C is defined as $\mathrm{rad}(C) := \min_{y \in \mathbb{R}^d} \max_{x \in C} \|x - y\|$ and the radius
cost of a k-clustering $\mathcal{C}_k$ is defined as its largest radius, i.e. $\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) := \max_{C \in \mathcal{C}_k} \mathrm{rad}(C)$. Finally, we
define the discrete radius of C to be $\mathrm{drad}(C) := \min_{y \in C} \max_{x \in C} \|x - y\|$ and the discrete radius cost of
a k-clustering $\mathcal{C}_k$ is defined as its largest discrete radius, i.e. $\mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_k) := \max_{C \in \mathcal{C}_k} \mathrm{drad}(C)$.
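To make the cost definitions concrete, the following minimal sketch evaluates the diameter and the discrete radius of a cluster and the corresponding costs of a clustering. It assumes the Euclidean norm and represents clusters as lists of coordinate tuples; the helper names are illustrative and not part of the paper. The radius is omitted because it requires a minimum enclosing ball computation.

```python
from itertools import combinations
from math import dist


def diam(cluster):
    """Largest distance between two points of the cluster (0 for a singleton)."""
    return max((dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def drad(cluster):
    """Smallest radius of a ball centered at a cluster point that covers the cluster."""
    return min(max(dist(x, y) for x in cluster) for y in cluster)


def cost(clustering, measure):
    """Cost of a clustering: the largest value of `measure` over its clusters."""
    return max(measure(c) for c in clustering)


# Illustrative 2-clustering of four points on a line.
clustering = [[(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)], [(10.0, 0.0)]]
print(cost(clustering, diam), cost(clustering, drad))
```

For any single cluster C the triangle inequality gives drad(C) ≤ diam(C) ≤ 2 · drad(C), which the example reproduces with diameter cost 2 and discrete radius cost 1.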

Problem 1 (discrete k-center). Given $k \in \mathbb{N}$ and a finite set $X \subset \mathbb{R}^d$ with $|X| \ge k$, find a k-clustering
$\mathcal{C}_k$ of X with minimal discrete radius cost.

Problem 2 (k-center). Given $k \in \mathbb{N}$ and a finite set $X \subset \mathbb{R}^d$ with $|X| \ge k$, find a k-clustering $\mathcal{C}_k$ of X
with minimal radius cost.

Problem 3 (diameter k-clustering). Given $k \in \mathbb{N}$ and a finite set $X \subset \mathbb{R}^d$ with $|X| \ge k$, find a
k-clustering $\mathcal{C}_k$ of X with minimal diameter cost.

For our analysis of agglomerative clustering, we repeatedly use the volume argument stated in
Lemma 5. This argument provides an upper bound on the minimum distance between two points from
a finite set of points lying inside the union of finitely many balls. For the application of this argument,
the following definition is crucial.

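Definition 4. Let $k \in \mathbb{N}$ and $r \in \mathbb{R}$. A set $X \subset \mathbb{R}^d$ is called (k, r)-coverable if there exist $y_1, \ldots, y_k \in \mathbb{R}^d$
with $X \subseteq \bigcup_{i=1}^{k} B_r^d(y_i)$.

Lemma 5. Let $k \in \mathbb{N}$, $r \in \mathbb{R}$ and $P \subset \mathbb{R}^d$ be finite and (k, r)-coverable with $|P| > k$. Then, there exist
distinct $p, q \in P$ such that $\|p - q\| \le 4r \sqrt[d]{k / |P|}$.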

Proof. Let $Z \subset \mathbb{R}^d$ with $|Z| = k$ and $P \subset \bigcup_{z \in Z} B_r^d(z)$. We define $\delta$ to be the minimum distance between
two points of P, i.e. $\delta := \min_{p \ne q \in P} \|p - q\|$. We assume for contradiction that $u := 4r \sqrt[d]{k / |P|} < \delta$. Since
$|P| > k$, there exists $z \in Z$ with $|B_r^d(z) \cap P| \ge 2$. It follows that $\delta \le 2r$ and hence $u/2 < r$. Note that for
any $y \in \mathbb{R}^d$, $R \in \mathbb{R}$, and any norm $\|\cdot\|$, we have $\mathrm{vol}(B_R^d(y)) = R^d \cdot V_d$, where $V_d$ is the volume of the
d-dimensional unit ball $B_1^d(0)$ (see [HI], Corollary 6.2.15). Therefore, we deduce

\[ \mathrm{vol}\Big( \bigcup_{z \in Z} B_{r+u/2}^d(z) \Big) \le \sum_{z \in Z} \mathrm{vol}\big( B_{r+u/2}^d(z) \big) < k \cdot (2r)^d \cdot V_d . \]

Furthermore, since any $p \in P$ is contained in a ball $B_r^d(z)$ for some $z \in Z$, we conclude that any ball
$B_{u/2}^d(p)$ for $p \in P$ is contained in a ball $B_{r+u/2}^d(z)$ for some $z \in Z$ (see Figure 1). Thus, we deduce

\[ \mathrm{vol}\Big( \bigcup_{p \in P} B_{u/2}^d(p) \Big) < k \cdot (2r)^d \cdot V_d . \tag{1} \]

Figure 1: The volume argument.

However, since $u < \delta$, for any distinct $p, q \in P$ we have $B_{u/2}^d(p) \cap B_{u/2}^d(q) = \emptyset$. Therefore, the total
volume of the $|P|$ balls $B_{u/2}^d(p)$ is given by

\[ \mathrm{vol}\Big( \bigcup_{p \in P} B_{u/2}^d(p) \Big) = |P| \cdot \Big( \frac{u}{2} \Big)^d \cdot V_d = k \cdot (2r)^d \cdot V_d , \]

using the definition of u. This contradicts (1). We obtain $\delta \le u$, which proves the lemma.

□
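The statement of Lemma 5 can be checked numerically on random instances. The sketch below is only an illustration: it assumes the Euclidean norm, draws points by rejection sampling from the union of k balls, and compares the minimum pairwise distance with the bound 4r · (k/|P|)^(1/d). All function names and the sampling ranges are hypothetical choices.

```python
import random
from itertools import combinations
from math import dist


def lemma5_bound(k, r, n, d):
    """Right-hand side 4r * (k/n)^(1/d) of Lemma 5 for n = |P| points."""
    return 4.0 * r * (k / n) ** (1.0 / d)


def sample_coverable_points(k, r, n, d, rng):
    """Draw n points from the union of k Euclidean balls of radius r."""
    centers = [[rng.uniform(-10.0, 10.0) for _ in range(d)] for _ in range(k)]
    origin = [0.0] * d
    points = []
    while len(points) < n:
        center = rng.choice(centers)
        offset = [rng.uniform(-r, r) for _ in range(d)]
        if dist(offset, origin) <= r:  # keep only offsets inside the ball
            points.append(tuple(c + o for c, o in zip(center, offset)))
    return points


def min_pairwise_distance(points):
    """Minimum distance between two distinct elements of the point list."""
    return min(dist(p, q) for p, q in combinations(points, 2))


rng = random.Random(0)
for k, n, d in [(2, 5, 1), (3, 20, 2), (4, 50, 3)]:
    P = sample_coverable_points(k, 1.0, n, d, rng)
    print(k, n, d, min_pairwise_distance(P) <= lemma5_bound(k, 1.0, n, d))
```

Random instances are usually far from tight, so the check mainly illustrates how the bound shrinks as the number of points grows relative to k.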



3 Analysis 

In this section we analyze the agglomerative clustering algorithms for the (discrete) k-center problem and
the diameter k-clustering problem. As mentioned in the introduction, an agglomerative algorithm takes
a bottom-up approach. It starts with the |X|-clustering that contains one cluster for each input point
and then successively merges two of the remaining clusters such that the cost of the resulting clustering
is minimized. That is, in each merge step the agglomerative algorithms for Problem 1, Problem 2, and
Problem 3 minimize the discrete radius, the radius, and the diameter of the resulting cluster, respectively.

Our main objective is the agglomerative complete linkage clustering algorithm, which minimizes the
diameter in every step. Nevertheless, we start with the analysis of the agglomerative algorithm for the
discrete k-center problem since it is the simplest one of the three. Then we adapt our analysis to the
k-center problem and finally to the diameter k-clustering problem. In each case we need to introduce
further techniques to deal with the increased complexity of the given problem.

We show that all three algorithms compute an $O(\log k)$-approximation for the corresponding
clustering problem. However, the dependency on the dimension which is hidden in the O-notation
ranges from only linear and additive in case of the discrete k-center problem to a factor that is doubly
exponential in case of the diameter k-clustering problem.
AgglomerativeDiscreteRadius(X):
X    finite set of input points from $\mathbb{R}^d$
1: $\mathcal{C}_{|X|} := \{\{x\} \mid x \in X\}$
2: for $i = |X| - 1, \ldots, 1$ do
3:     find distinct clusters $A, B \in \mathcal{C}_{i+1}$ minimizing $\mathrm{drad}(A \cup B)$
4:     $\mathcal{C}_i := (\mathcal{C}_{i+1} \setminus \{A, B\}) \cup \{A \cup B\}$
5: end for
6: return $\mathcal{C}_1, \ldots, \mathcal{C}_{|X|}$

Algorithm 1: The agglomerative algorithm for the discrete k-center problem.
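As an illustration, Algorithm 1 can be written as a short brute-force Python routine. This is a sketch under assumptions that are not part of the paper: the Euclidean norm is used, input points are distinct coordinate tuples, ties are broken by enumeration order, and no attempt is made to be efficient.

```python
from itertools import combinations
from math import dist


def drad(cluster):
    """Discrete radius: best covering radius with a center chosen from the cluster."""
    return min(max(dist(x, y) for x in cluster) for y in cluster)


def agglomerative_discrete_radius(points):
    """Return a dict mapping i to the computed i-clustering, for i = |X|, ..., 1."""
    current = [frozenset([p]) for p in points]      # step 1: singleton clusters
    hierarchy = {len(current): list(current)}
    while len(current) > 1:                         # step 2: one merge per iteration
        # Step 3: pick the pair whose union has the smallest discrete radius.
        a, b = min(combinations(current, 2),
                   key=lambda pair: drad(pair[0] | pair[1]))
        # Step 4: replace the two clusters by their union.
        current = [c for c in current if c is not a and c is not b] + [a | b]
        hierarchy[len(current)] = list(current)
    return hierarchy


X = [(0.0,), (1.0,), (10.0,), (11.0,)]
hierarchy = agglomerative_discrete_radius(X)
for i in sorted(hierarchy, reverse=True):
    print(i, [sorted(c) for c in hierarchy[i]])
```

On the four points above, the two close pairs are merged first, so the computed 2-clustering has discrete radius cost 1.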

As mentioned in the introduction, the costs of optimal solutions to the three problems are within a
factor of 2 from each other. That is, each algorithm computes an $O(\log k)$-approximation for all three
problems. However, we will analyze the proper agglomerative algorithm for each problem.

3.1 Discrete k-center clustering

The agglomerative algorithm for the discrete k-center problem is stated as Algorithm 1. Given a finite
set $X \subset \mathbb{R}^d$ of input points, the algorithm computes hierarchical k-clusterings for all values of k between
1 and $|X|$. We denote them by $\mathcal{C}_1, \ldots, \mathcal{C}_{|X|}$. Throughout this section, cost always means discrete radius
cost, and $\mathrm{opt}_k$ refers to the cost of an optimal discrete k-center clustering of $X \subset \mathbb{R}^d$ where $k \in \mathbb{N}$ with
$k \le |X|$, i.e. the cost of an optimal solution to Problem 1.

The following theorem states our result for the discrete k-center problem.

Theorem 6. Let $X \subset \mathbb{R}^d$ be a finite set of points. Then, for all $k \in \mathbb{N}$ with $k \le |X|$, the partition $\mathcal{C}_k$ of
X into k clusters as computed by Algorithm 1 satisfies

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_k) < (20d + 2\log_2(k) + 2) \cdot \mathrm{opt}_k , \]

where $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 1.

Since any cluster C is contained in a ball of radius $\mathrm{drad}(C)$, we have that X is $(k, \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_k))$-coverable
for any k-clustering $\mathcal{C}_k$ of X. It follows that X is $(k, \mathrm{opt}_k)$-coverable. This fact, as well as the
following observation about the greedy strategy of Algorithm 1, will be used frequently in our analysis.

Observation 7. The cost of all computed clusterings is equal to the discrete radius of the cluster created 
last. Furthermore, the discrete radius of the union of any two clusters is always an upper bound for the 
cost of the clustering to be computed next. 

We prove Theorem 6 in two steps. First, Proposition 8 in Section 3.1.1 provides an upper bound on
the cost of the intermediate 2k-clustering. This upper bound is independent of k and $|X|$, only linear in
d and may be of independent interest. In its proof, we use Lemma 5 to bound the distance between the
centers of pairs of remaining clusters. The cost of merging such a pair gives an upper bound on the cost
of the next merge step. Therefore, we can bound the discrete radius of the created cluster by the sum of
the larger of the two clusters' discrete radii and the distance between their centers.

Second, in Section 3.1.2 we analyze the remaining k merge steps of Algorithm 1 down to the
computation of the k-clustering. There, we no longer need to apply the volume argument from Lemma 5
to bound the distance between two cluster centers. It will be replaced by a very simple bound that is
already sufficient. Analogously to the first step, this leads to a bound for the cost of merging a pair of
clusters.

3.1.1 Analysis of the 2k-clustering

Proposition 8. Let $X \subset \mathbb{R}^d$ be finite. Then, for all $k \in \mathbb{N}$ with $2k \le |X|$, the partition $\mathcal{C}_{2k}$ of X into
2k clusters as computed by Algorithm 1 satisfies

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) < 20d \cdot \mathrm{opt}_k , \]

where $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 1.

Figure 2: $\mathrm{drad}(C_1 \cup C_2) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m) + \|p_{C_1} - p_{C_2}\|$.

To prove Proposition 8 we divide the merge steps of Algorithm 1 into phases, each reducing the
number of remaining clusters by one fourth. The following lemma bounds the increase of the cost during
a single phase by an additive term.



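Lemma 9. Let $m \in \mathbb{N}$ with $2k < m \le |X|$. Then,

\[ \mathrm{cost}_{\mathrm{drad}}\big(\mathcal{C}_{\lfloor 3m/4 \rfloor}\big) < \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m) + 4 \sqrt[d]{\frac{2k}{m}} \cdot \mathrm{opt}_k . \]

Proof. Let $R := \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m)$. From every cluster $C \in \mathcal{C}_m$, we fix a center $p_C \in C$ with $C \subset B_R^d(p_C)$.
Let $t := \lfloor 3m/4 \rfloor$. Then, $\mathcal{C}_m \cap \mathcal{C}_{t+1}$ is the set of clusters from $\mathcal{C}_m$ that still exist $\lceil m/4 \rceil - 1$ merge steps
after the computation of $\mathcal{C}_m$. In each iteration of its loop, the algorithm can merge at most two clusters
from $\mathcal{C}_m$. Thus, $|\mathcal{C}_m \cap \mathcal{C}_{t+1}| > m/2$.

Let $P := \{p_C \mid C \in \mathcal{C}_m \cap \mathcal{C}_{t+1}\}$. Then, $|P| = |\mathcal{C}_m \cap \mathcal{C}_{t+1}| > m/2 > k$. Since X is $(k, \mathrm{opt}_k)$-coverable,
so is $P \subset X$. Therefore, by Lemma 5, there exist distinct $C_1, C_2 \in \mathcal{C}_m \cap \mathcal{C}_{t+1}$ such that
$\|p_{C_1} - p_{C_2}\| < 4 \sqrt[d]{2k/m} \cdot \mathrm{opt}_k$. Then, the distance from $p_{C_1}$ to any $q \in C_2$ is at most $4 \sqrt[d]{2k/m} \cdot \mathrm{opt}_k + R$. We conclude
that merging $C_1$ and $C_2$ would result in a cluster whose discrete radius can be upper bounded by
$\mathrm{drad}(C_1 \cup C_2) < \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m) + 4 \sqrt[d]{2k/m} \cdot \mathrm{opt}_k$ (see Figure 2). The result follows using $C_1, C_2 \in \mathcal{C}_{t+1}$ and
Observation 7. □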



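To prove Proposition 8 we apply Lemma 9 for $\lceil \log_{4/3} \frac{|X|}{2k} \rceil$ consecutive phases.

Proof of Proposition 8. Let $u := \lceil \log_{4/3} \frac{|X|}{2k} \rceil$ and define $m_i := \lceil (3/4)^i |X| \rceil$ for all $i = 0, \ldots, u$. Then,
$m_u \le 2k$ and $m_i > 2k$ for all $i = 0, \ldots, u - 1$. Since

\[ \Big\lfloor \frac{3 m_i}{4} \Big\rfloor = \Big\lfloor \frac{3}{4} \Big\lceil \Big(\frac{3}{4}\Big)^{i} |X| \Big\rceil \Big\rfloor \le \Big\lceil \Big(\frac{3}{4}\Big)^{i+1} |X| \Big\rceil = m_{i+1} \]

and Algorithm 1 uses a greedy strategy, we get $\mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{\lfloor 3m_i/4 \rfloor})$ for all $i = 0,
\ldots, u - 1$. Combining this with Lemma 9 (applied to $m = m_i$), we obtain

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_{i+1}}) < \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_i}) + 4 \sqrt[d]{\frac{2k}{m_i}} \cdot \mathrm{opt}_k . \]

By repeatedly applying this inequality for $i = 0, \ldots, u - 1$ and using $\mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_u})$ and
$\mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_0}) = 0$, we deduce

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) < 4 \sum_{i=0}^{u-1} \sqrt[d]{\frac{2k}{m_i}} \cdot \mathrm{opt}_k \le 4 \sum_{i=0}^{u-1} \sqrt[d]{\frac{2k}{(3/4)^i |X|}} \cdot \mathrm{opt}_k . \]

Solving the geometric series and using $u - 1 < \log_{4/3} \frac{|X|}{2k}$ leads to

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) < 4 \sqrt[d]{\frac{2k}{|X|}} \cdot \frac{(4/3)^{u/d} - 1}{\sqrt[d]{4/3} - 1} \cdot \mathrm{opt}_k < \frac{4 \sqrt[d]{4/3}}{\sqrt[d]{4/3} - 1} \cdot \mathrm{opt}_k . \tag{2} \]

By taking only the first two terms of the series expansion of the exponential function, we get
$\sqrt[d]{4/3} = e^{\ln(4/3)/d} > 1 + \frac{\ln(4/3)}{d}$. Substituting this bound into (2) gives

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) < \frac{4 d \sqrt[d]{4/3}}{\ln(4/3)} \cdot \mathrm{opt}_k < 20d \cdot \mathrm{opt}_k . \]

□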
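The constant 20d at the end of this proof can be sanity-checked numerically. The sketch below assumes that the bound obtained from the geometric series has the form 4 · (4/3)^(1/d) / ((4/3)^(1/d) − 1) and that the linearized bound is 4d · (4/3)^(1/d) / ln(4/3); both forms are assumptions read off from the derivation above, and the function names are hypothetical.

```python
from math import log


def geometric_bound(d):
    """Assumed closed form of the geometric-series bound on cost / opt_k."""
    root = (4.0 / 3.0) ** (1.0 / d)
    return 4.0 * root / (root - 1.0)


def linearized_bound(d):
    """Bound after replacing (4/3)^(1/d) - 1 by the smaller value ln(4/3) / d."""
    return 4.0 * d * (4.0 / 3.0) ** (1.0 / d) / log(4.0 / 3.0)


for d in (1, 2, 3, 10):
    print(d, geometric_bound(d), linearized_bound(d), 20 * d)
```

The loop prints the three quantities side by side so that the chain geometric bound < linearized bound < 20d can be inspected for small dimensions.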



3.1.2 Analysis of the remaining merge steps 



The analysis of the remaining merge steps introduces the $O(\log k)$ term to the approximation factor
of our result. It is similar to the analysis used in the proof of Proposition 8. Again, we divide the
merge steps into phases. However, this time one phase consists of one half of the remaining merge steps.
Furthermore, we are able to replace the volume argument from Lemma 5 by a simpler bound. More
precisely, as long as there are more than k clusters left, we are able to find a pair of clusters whose
centers lie in the same cluster of an optimal k-clustering. That is, the distance between the centers is
at most two times the discrete radius of the common cluster in the optimal clustering. The following
lemma bounds the increase of the cost during a single phase.

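Lemma 10. Let $m \in \mathbb{N}$ with $k < m \le |X|$. Then,

\[ \mathrm{cost}_{\mathrm{drad}}\big(\mathcal{C}_{k + \lfloor (m-k)/2 \rfloor}\big) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m) + 2 \, \mathrm{opt}_k . \]

Proof. Let $R := \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m)$. From every cluster $C \in \mathcal{C}_m$, we fix a center $p_C \in C$ with $C \subset B_R^d(p_C)$.
Let $t := k + \lfloor (m-k)/2 \rfloor$. Then, $\mathcal{C}_m \cap \mathcal{C}_{t+1}$ is the set of clusters from $\mathcal{C}_m$ that still exist $\lceil (m-k)/2 \rceil - 1$ merge
steps after the computation of $\mathcal{C}_m$. In each iteration of its loop, the algorithm can merge at most two
clusters from $\mathcal{C}_m$. Thus, $|\mathcal{C}_m \cap \mathcal{C}_{t+1}| > k$.

Let $P := \{p_C \mid C \in \mathcal{C}_m \cap \mathcal{C}_{t+1}\}$. Since X is $(k, \mathrm{opt}_k)$-coverable, so is $P \subset X$. Therefore, using $|P| > k$
it follows that there exist distinct $C_1, C_2 \in \mathcal{C}_m \cap \mathcal{C}_{t+1}$ such that $p_{C_1}$ and $p_{C_2}$ are contained in the same
ball of radius $\mathrm{opt}_k$, i.e. $\|p_{C_1} - p_{C_2}\| \le 2 \, \mathrm{opt}_k$. Then, the distance from $p_{C_1}$ to any $q \in C_2$ is at most
$2 \, \mathrm{opt}_k + R$. We conclude that merging $C_1$ and $C_2$ would result in a cluster whose discrete radius can
be upper bounded by $\mathrm{drad}(C_1 \cup C_2) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_m) + 2 \, \mathrm{opt}_k$ (see Figure 2). The result follows using
$C_1, C_2 \in \mathcal{C}_{t+1}$ and Observation 7. □

To prove Theorem 6 we apply Lemma 10 for about $\log k$ consecutive phases.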

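Proof of Theorem 6. Let $\varepsilon > 0$ and $u := \lceil \log_2(k) + \varepsilon \rceil$ such that $\log_2 k < u \le \log_2(k) + 1$. Furthermore,
define $m_i := k + \lfloor (1/2)^i k \rfloor$ for all $i = 0, \ldots, u$. Then, $m_u = k$ and $m_i > k$ for all $i = 0, \ldots, u - 1$. Since

\[ k + \Big\lfloor \frac{m_i - k}{2} \Big\rfloor = k + \Big\lfloor \frac{1}{2} \Big\lfloor \Big(\frac{1}{2}\Big)^{i} k \Big\rfloor \Big\rfloor \le k + \Big\lfloor \Big(\frac{1}{2}\Big)^{i+1} k \Big\rfloor = m_{i+1} \]

and Algorithm 1 uses a greedy strategy, we
deduce $\mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{k + \lfloor (m_i - k)/2 \rfloor})$ for all $i = 0, \ldots, u - 1$. Combining this with Lemma 10
(applied to $m = m_i$), we obtain

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{m_i}) + 2 \, \mathrm{opt}_k . \]

By repeatedly applying this inequality for $i = 0, \ldots, u - 1$ and using $m_0 = 2k$, we get

\[ \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_k) \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) + \sum_{i=0}^{u-1} 2 \, \mathrm{opt}_k \le \mathrm{cost}_{\mathrm{drad}}(\mathcal{C}_{2k}) + (2 \log_2(k) + 2) \cdot \mathrm{opt}_k . \]

Hence, the result follows using Proposition 8. □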
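The phase schedule used in this proof can be illustrated with a small sketch. It assumes the phase sizes have the form m_i = k + ⌊k / 2^i⌋ and that u is the smallest integer strictly greater than log₂ k, which for k ≥ 1 equals `k.bit_length()` in Python; the function name is hypothetical.

```python
def phase_sizes(k):
    """Cluster counts m_0, ..., m_u with m_i = k + floor(k / 2**i)."""
    u = k.bit_length()  # smallest integer strictly greater than log2(k)
    return [k + (k >> i) for i in range(u + 1)]


for k in (1, 5, 8):
    sizes = phase_sizes(k)
    # Each phase adds at most 2 * opt_k to the cost; there are len(sizes) - 1 phases.
    print(k, sizes, "phases:", len(sizes) - 1)
```

The schedule starts at 2k clusters and ends at k clusters after at most log₂(k) + 1 phases, which is where the additive term (2 log₂(k) + 2) · opt_k comes from.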
AgglomerativeRadius(X):
X    finite set of input points from $\mathbb{R}^d$
1: $\mathcal{C}_{|X|} := \{\{x\} \mid x \in X\}$
2: for $i = |X| - 1, \ldots, 1$ do
3:     find distinct clusters $A, B \in \mathcal{C}_{i+1}$ minimizing $\mathrm{rad}(A \cup B)$
4:     $\mathcal{C}_i := (\mathcal{C}_{i+1} \setminus \{A, B\}) \cup \{A \cup B\}$
5: end for
6: return $\mathcal{C}_1, \ldots, \mathcal{C}_{|X|}$

Algorithm 2: The agglomerative algorithm for the k-center problem.



3.2 k-center clustering

The agglomerative algorithm for the k-center problem is stated as Algorithm 2. The only difference to
Algorithm 1 is the minimization of the radius instead of the discrete radius in Step 3.

In the following, cost always means radius cost and $\mathrm{opt}_k$ refers to the cost of an optimal k-center
clustering of $X \subset \mathbb{R}^d$ where $k \in \mathbb{N}$ with $k \le |X|$.

Observation 11 (analogous to Observation 7). The cost of all computed clusterings is equal to the
radius of the cluster created last. Furthermore, the radius of the union of any two clusters is always an
upper bound for the cost of the clustering to be computed next.

The following theorem states our result for the k-center problem.

Theorem 12. Let $X \subset \mathbb{R}^d$ be a finite set of points. Then, for all $k \in \mathbb{N}$ with $k \le |X|$, the partition $\mathcal{C}_k$
of X into k clusters as computed by Algorithm 2 satisfies

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) = O(\log k) \cdot \mathrm{opt}_k , \]

where $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 2, and the constant hidden in the
O-notation is singly exponential in the dimension d.

Theorem 12 holds for any particular tie-breaking strategy. However, to keep the analysis simple, we
assume that there are no ties. That is, we assume that for any input set X the clusterings computed by
Algorithm 2 are uniquely determined. As in the proof of Theorem 6, we first show a bound for the cost
of the intermediate 2k-clustering. However, we have to apply a different analysis. As a consequence, the
dependency on the dimension increases from linear and additive to a singly exponential factor.

3.2.1 Analysis of the 2k-clustering

Proposition 13. Let $X \subset \mathbb{R}^d$ be finite. Then, for all $k \in \mathbb{N}$ with $2k \le |X|$, the partition $\mathcal{C}_{2k}$ of X into
2k clusters as computed by Algorithm 2 satisfies

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{2k}) < 24d \cdot e^{24d} \cdot \mathrm{opt}_k , \]

where $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 2.

Just as in the analysis of Algorithm 1, we divide the merge steps of Algorithm 2 into phases, such
that in each phase the number of remaining clusters is reduced by one fourth. As in the discrete case,
the input points are $(k, \mathrm{opt}_k)$-coverable. However, centers corresponding to an intermediate solution
computed by Algorithm 2 need not be covered by the k balls induced by an optimal solution. As a
consequence, we are no longer able to apply Lemma 5 to the centers as in the discrete case.

To bound the increase of the cost during a single phase, we cover the remaining clusters at the
beginning of a phase by a set of overlapping balls. Each of the clusters is completely contained in one
of these balls that all have the same radius. Furthermore, the number of remaining clusters will be at
least twice the number of these balls. It follows that there are many pairs of clusters that are contained
in the same ball. Then, as long as the existence of at least one such pair can be guaranteed, the radius
of the cluster created next can be bounded by the radius of the balls. The following lemma will be used
to bound the increase of the cost during a single phase.

Figure 3: Intermediate centers.



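Lemma 14. Let $m \in \mathbb{N}$ with $2k < m \le |X|$. Then,

\[ \mathrm{cost}_{\mathrm{rad}}\big(\mathcal{C}_{\lfloor 3m/4 \rfloor}\big) < \Big( 1 + 6 \sqrt[d]{\frac{2k}{m}} \Big) \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m) + 6 \sqrt[d]{\frac{2k}{m}} \cdot \mathrm{opt}_k . \]

Proof. Let $\mathcal{P} = \{P_1, \ldots, P_k\}$ be an optimal k-clustering of X. We fix $y_1, \ldots, y_k \in \mathbb{R}^d$ such that
$P_i \subset B_{\mathrm{opt}_k}^d(y_i)$ for $i = 1, \ldots, k$. For any $C \in \mathcal{C}_m$ let $z_C \in \mathbb{R}^d$ such that $C \subset B_{\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)}^d(z_C)$. It follows that
each $z_C$ is contained in at least one of the balls $B_{\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)}^d(y_i)$ for $i = 1, \ldots, k$ (see Figure 3).

For $\lambda \in \mathbb{R}$ with $\lambda > 0$, a ball of radius $\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)$ can be covered by $(3/\lambda)^d$ balls of radius
$\lambda (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m))$ (see [13]). Choosing $\lambda = 3 / \sqrt[d]{\ell}$ with $\ell := \lfloor \frac{m}{2k} \rfloor$, we get that each of the balls
$B_{\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)}^d(y_i)$ for $i = 1, \ldots, k$ can be covered by $\ell \le \frac{m}{2k}$ balls of radius

\[ \varepsilon := \frac{3}{\sqrt[d]{\ell}} \cdot (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)) . \]

Therefore, there exist $k\ell \le \frac{m}{2}$ balls $B_1, \ldots, B_{k\ell}$ of radius $\varepsilon$ such that each $z_C$ for $C \in \mathcal{C}_m$ is contained
in at least one of these balls. For $i = 1, \ldots, k\ell$ let $a_i \in \mathbb{R}^d$ such that $B_i = B_\varepsilon^d(a_i)$. Then, any cluster
$C \in \mathcal{C}_m$ is contained in at least one of the balls $A_1, \ldots, A_{k\ell}$ with $A_i = B_{\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m) + \varepsilon}^d(a_i)$ for $i = 1, \ldots, k\ell$
(see Figure 4).

Figure 4: Covering centers and clusters.

Let $t := \lfloor 3m/4 \rfloor$ and $\mathcal{C}_m \cap \mathcal{C}_{t+1}$ be the set of clusters from $\mathcal{C}_m$ that still exist $\lceil m/4 \rceil - 1$ merge steps after
the computation of $\mathcal{C}_m$. In each iteration of its loop, the algorithm can merge at most two clusters from
$\mathcal{C}_m$. Thus, $|\mathcal{C}_m \cap \mathcal{C}_{t+1}| > m/2$.

Since $k\ell \le m/2$, there exist two clusters $C_1, C_2 \in \mathcal{C}_m \cap \mathcal{C}_{t+1}$ that are contained in the same ball $A_i$
with $i \in \{1, \ldots, k\ell\}$. Therefore, merging clusters $C_1$ and $C_2$ would result in a cluster whose radius can
be upper bounded by $\mathrm{rad}(C_1 \cup C_2) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m) + \varepsilon$. Using Observation 11 and the fact that $C_1$ and
$C_2$ are part of the clustering $\mathcal{C}_{t+1}$, we can upper bound the cost of $\mathcal{C}_t$ by

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_t) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m) + \varepsilon . \]

It remains to show $\varepsilon < 6 \sqrt[d]{2k/m} \cdot (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m))$. Since $\frac{m}{2k} > 1$, we have $\frac{m}{2k} < 2 \lfloor \frac{m}{2k} \rfloor$. Thus,

\[ \varepsilon = \frac{3}{\sqrt[d]{\lfloor m/(2k) \rfloor}} \cdot (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)) < 3 \sqrt[d]{\frac{4k}{m}} \cdot (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)) \le 6 \sqrt[d]{\frac{2k}{m}} \cdot (\mathrm{opt}_k + \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_m)) . \]

□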






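To prove Proposition 13 we apply Lemma 14 for $\lceil \log_{4/3} \frac{|X|}{2k} \rceil$ consecutive phases.

Proof of Proposition 13. Let $u := \lceil \log_{4/3} \frac{|X|}{2k} \rceil$ and define $m_i := \lceil (3/4)^i |X| \rceil$ for all $i = 0, \ldots, u$. Then,
$m_u \le 2k$ and $m_i > 2k$ for all $i = 0, \ldots, u - 1$. Analogously to the proof of Proposition 8, we get
$\lfloor 3m_i/4 \rfloor \le m_{i+1}$, and using Lemma 14 we deduce

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{m_{i+1}}) < \Big( 1 + 6 \sqrt[d]{\frac{2k}{m_i}} \Big) \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{m_i}) + 6 \sqrt[d]{\frac{2k}{m_i}} \cdot \mathrm{opt}_k \]

for all $i = 0, \ldots, u - 1$. By repeatedly applying this inequality and using $\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{2k}) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{m_u})$ and
$\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{m_0}) = 0$, we get

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{2k}) < \sum_{i=0}^{u-1} \Big( 6 \sqrt[d]{\frac{2k}{m_i}} \prod_{j=i+1}^{u-1} \Big( 1 + 6 \sqrt[d]{\frac{2k}{m_j}} \Big) \Big) \cdot \mathrm{opt}_k \]
\[ \le 6 \sum_{i=0}^{u-1} \Big( \sqrt[d]{\frac{2k}{(3/4)^i |X|}} \prod_{j=i+1}^{u-1} \Big( 1 + 6 \sqrt[d]{\frac{2k}{(3/4)^j |X|}} \Big) \Big) \cdot \mathrm{opt}_k \tag{3} \]
\[ = 6 \sum_{i=0}^{u-1} \Big( \sqrt[d]{\frac{2k}{(3/4)^{u-1-i} |X|}} \prod_{j=0}^{i-1} \Big( 1 + 6 \sqrt[d]{\frac{2k}{(3/4)^{u-1-j} |X|}} \Big) \Big) \cdot \mathrm{opt}_k . \tag{4} \]

Here, we obtain (3) using $m_i \ge (3/4)^i |X|$ and we obtain (4) by substituting $u - 1 - i$ for i. Using
$u - 1 < \log_{4/3} \frac{|X|}{2k}$, we deduce

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{2k}) < 6 \sum_{i=0}^{u-1} \Big( \Big(\frac{3}{4}\Big)^{i/d} \prod_{j=0}^{i-1} \Big( 1 + 6 \Big(\frac{3}{4}\Big)^{j/d} \Big) \Big) \cdot \mathrm{opt}_k . \tag{5} \]

By taking only the first two terms of the series expansion of the exponential function, we get
$1 + 6 (3/4)^{j/d} \le e^{6 (3/4)^{j/d}}$ and therefore

\[ \prod_{j=0}^{i-1} \Big( 1 + 6 \Big(\frac{3}{4}\Big)^{j/d} \Big) \le \prod_{j=0}^{i-1} e^{6 (3/4)^{j/d}} = e^{6 \sum_{j=0}^{i-1} (3/4)^{j/d}} . \tag{6} \]

The sum in the exponent can be bounded by the infinite geometric series

\[ \sum_{j=0}^{i-1} \Big(\frac{3}{4}\Big)^{j/d} < \sum_{j=0}^{\infty} \Big( \sqrt[d]{3/4} \Big)^{j} = \frac{1}{1 - \sqrt[d]{3/4}} \le 4d , \tag{7} \]

where the last inequality follows by upper bounding the convex function $f(x) = (3/4)^x$ in the interval
$[0, 1]$ by the line through $f(0)$ and $f(1)$. Putting Inequalities (5), (6) and (7) together then gives

\[ \mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_{2k}) < 6 \sum_{i=0}^{u-1} \Big(\frac{3}{4}\Big)^{i/d} \cdot e^{24d} \cdot \mathrm{opt}_k < 24d \cdot e^{24d} \cdot \mathrm{opt}_k , \]

where the last inequality follows by using Inequality (7) again.

□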
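The geometric-series bound used twice in this proof, namely that the sum of (3/4)^(j/d) over all j ≥ 0 is at most 4d, can be checked numerically. The sketch below assumes exactly this reading of Inequality (7); the function name is hypothetical.

```python
def geometric_series_limit(d):
    """Value of sum_{j >= 0} (3/4)^(j/d), i.e. 1 / (1 - (3/4)^(1/d))."""
    return 1.0 / (1.0 - 0.75 ** (1.0 / d))


for d in (1, 2, 5, 20):
    # The chord of f(x) = (3/4)^x on [0, 1] gives (3/4)^(1/d) <= 1 - 1/(4d).
    print(d, geometric_series_limit(d), 4 * d)
```

For d = 1 the bound is attained with equality, since the series then sums to 4; for larger d the limit stays below 4d.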



3.2.2 Connected instances 

The analysis of the remaining merge steps from the discrete k-center case (cf. Section 3.1.2) is not
transferable to the k-center case. Again, as in the proof of Proposition 13, we are no longer able to
derive a simple additive bound on the increase of the cost when merging two clusters. In order to
preserve the logarithmic dependency of the approximation factor on k, we show that it is sufficient to
analyze Algorithm 2 on a subset $Y \subset X$ satisfying a certain connectivity property. Using this property,
we are able to apply a combinatorial approach that relies on the number of merge steps left.

We start by defining the connectivity property that will be used to relate clusters to an optimal
k-clustering.






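Definition 15. Let $Z \subset \mathbb{R}^d$ and $r \in \mathbb{R}$. Two sets $A, B \subset \mathbb{R}^d$ are called $(Z, r)$-connected if there exists
a $z \in Z$ with $\mathrm{B}^d_r(z) \cap A \neq \emptyset$ and $\mathrm{B}^d_r(z) \cap B \neq \emptyset$.

Note that for any two $(Z, r)$-connected clusters $A, B$, we have

\begin{equation}
\mathrm{rad}(A \cup B) \le \mathrm{rad}(A) + 2 \cdot \mathrm{rad}(B) + 2r. \tag{8}
\end{equation}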
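The connectivity property of Definition 15 can be tested directly for finite point sets. The following Python sketch is only an illustration: it assumes closed balls, uses a metric induced by an $\ell_p$-norm, and the names `norm_dist`, `intersects_ball` and `are_connected` are hypothetical helpers that do not appear in the paper.

```python
import math
from typing import Iterable, Sequence

Point = Sequence[float]


def norm_dist(p: Point, q: Point, p_norm: float = 2.0) -> float:
    """Distance induced by the l_p norm; p_norm = math.inf gives the maximum norm."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if math.isinf(p_norm):
        return max(diffs)
    return sum(x ** p_norm for x in diffs) ** (1.0 / p_norm)


def intersects_ball(cluster: Iterable[Point], center: Point, r: float, p_norm: float = 2.0) -> bool:
    """True if some point of the cluster lies in the closed ball of radius r around center."""
    return any(norm_dist(x, center, p_norm) <= r for x in cluster)


def are_connected(A, B, Z, r: float, p_norm: float = 2.0) -> bool:
    """(Z, r)-connectedness: a single ball around some z in Z meets both A and B."""
    return any(
        intersects_ball(A, z, r, p_norm) and intersects_ball(B, z, r, p_norm)
        for z in Z
    )
```

For example, `are_connected([(0, 0)], [(2, 0)], [(1, 0)], 1.0)` holds because the unit ball around `(1, 0)` touches both singletons, while moving the center to `(5, 0)` breaks the connection.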

Next, we show that for any input set $X$ we can bound the cost of the k-clustering computed by
Algorithm 2 by the cost of the $\ell$-clustering computed by the algorithm on a connected subset $Y \subseteq X$
for a proper $\ell \le k$. Recall that by our convention from the beginning of Section 3.2 the clusterings
computed by Algorithm 2 on a particular input set are uniquely determined.

Lemma 16. Let $X \subset \mathbb{R}^d$ be finite and $k \in \mathbb{N}$ with $k \le |X|$. Then, there exists a subset $Y \subseteq X$, a
number $\ell \in \mathbb{N}$ with $\ell \le k$ and $\ell < |Y|$, and a set $Z \subset \mathbb{R}^d$ with $|Z| = \ell$ such that:

1. $Y$ is $(\ell, \mathrm{opt}_k)$-coverable;

2. $\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_\ell)$;

3. For all $n \in \mathbb{N}$ with $\ell + 1 \le n \le |Y|$, every cluster in $\mathcal{P}_n$ is $(Z, \mathrm{opt}_k)$-connected to another cluster
in $\mathcal{P}_n$.

Here, the collection $\mathcal{P}_1, \ldots, \mathcal{P}_{|Y|}$ denotes the hierarchical clustering computed by Algorithm 2 on input $Y$.

Proof. To define $Y$, $Z$, and $\ell$ we consider the $(k+1)$-clustering computed by Algorithm 2 on input $X$.
We know that $X = \bigcup_{A \in \mathcal{C}_{k+1}} A$ is $(k, \mathrm{opt}_k)$-coverable. Let $E \subseteq \mathcal{C}_{k+1}$ be a minimal subset such that
$\bigcup_{A \in E} A$ is $(|E| - 1, \mathrm{opt}_k)$-coverable, i.e., for all sets $F \subseteq \mathcal{C}_{k+1}$ with $|F| < |E|$ the union $\bigcup_{A \in F} A$ is not
$(|F| - 1, \mathrm{opt}_k)$-coverable. Since a set $F$ of size 1 cannot be $(|F| - 1, \mathrm{opt}_k)$-coverable, we get $|E| \ge 2$.

Let $Y := \bigcup_{A \in E} A$ and $\ell := |E| - 1$. Then, $\ell \le k$ and $Y$ is $(\ell, \mathrm{opt}_k)$-coverable. This establishes
property 1.

It follows that there exists a set $Z \subset \mathbb{R}^d$ with $|Z| = \ell$ and $Y \subseteq \bigcup_{z \in Z} \mathrm{B}^d_{\mathrm{opt}_k}(z)$. Furthermore, we let
$\mathcal{P}_1, \ldots, \mathcal{P}_{|Y|}$ be the hierarchical clustering computed by Algorithm 2 on input $Y$.

Since $Y$ is the union of the clusters from $E \subseteq \mathcal{C}_{k+1}$, each merge step between the computation of
$\mathcal{C}_{|X|}$ and $\mathcal{C}_{k+1}$ merges either two clusters $A, B \subseteq Y$ or two clusters $A, B \subseteq X \setminus Y$. The merge steps
inside $X \setminus Y$ have no influence on the clusters inside $Y$. Furthermore, the merge steps inside $Y$ would
be the same in the absence of the clusters inside $X \setminus Y$. Therefore, on input $Y$, Algorithm 2 computes
the $(\ell + 1)$-clustering $\mathcal{P}_{\ell+1} = E = \mathcal{C}_{k+1} \cap 2^Y$. Thus, $\mathcal{P}_{\ell+1} \subseteq \mathcal{C}_{k+1}$.

To compute $\mathcal{P}_\ell$ on input $Y$, Algorithm 2 merges two clusters from $\mathcal{P}_{\ell+1}$ that minimize the radius of
the resulting cluster. Analogously, on input $X$, Algorithm 2 merges two clusters from $\mathcal{C}_{k+1}$ to compute
$\mathcal{C}_k$. Since $\mathcal{P}_{\ell+1} \subseteq \mathcal{C}_{k+1}$, Observation 11 implies $\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_\ell)$, thus proving property 2.

It remains to show that for all $n \in \mathbb{N}$ with $\ell + 1 \le n \le |Y|$ it holds that every cluster in $\mathcal{P}_n$ is
$(Z, \mathrm{opt}_k)$-connected to another cluster in $\mathcal{P}_n$ (property 3). By the definition of $Z$, every cluster in $\mathcal{P}_n$
intersects at least one ball $\mathrm{B}^d_{\mathrm{opt}_k}(z)$ for $z \in Z$. Therefore, it is enough to show that each ball $\mathrm{B}^d_{\mathrm{opt}_k}(z)$
intersects at least two clusters from $\mathcal{P}_n$. We first show this property for $n = \ell + 1$. For $\ell = 1$ this
follows from the fact that $\mathrm{B}^d_{\mathrm{opt}_k}(z)$ with $Z = \{z\}$ has to contain both clusters from $\mathcal{P}_2$. For $\ell > 1$, we are
otherwise able to remove one cluster from $\mathcal{P}_{\ell+1}$ and get $\ell$ clusters whose union is $(\ell - 1, \mathrm{opt}_k)$-coverable.
This contradicts the definition of $E = \mathcal{P}_{\ell+1}$ as a minimal subset with this property.

To show property 3 for general $n$, let $C_1 \in \mathcal{P}_n$ and $z \in Z$ with $\mathrm{B}^d_{\mathrm{opt}_k}(z) \cap C_1 \neq \emptyset$. There exists
a unique cluster $\tilde{C}_1 \in \mathcal{P}_{\ell+1}$ with $C_1 \subseteq \tilde{C}_1$. Then, we have $\mathrm{B}^d_{\mathrm{opt}_k}(z) \cap \tilde{C}_1 \neq \emptyset$. However, we have just
shown that $\mathrm{B}^d_{\mathrm{opt}_k}(z)$ has to intersect at least two clusters from $\mathcal{P}_{\ell+1}$. Thus, there exists another cluster
$\tilde{C}_2 \in \mathcal{P}_{\ell+1}$ with $\mathrm{B}^d_{\mathrm{opt}_k}(z) \cap \tilde{C}_2 \neq \emptyset$. Since every cluster from $\mathcal{P}_{\ell+1}$ is a union of clusters from $\mathcal{P}_n$, there
exists at least one cluster $C_2 \in \mathcal{P}_n$ with $C_2 \subseteq \tilde{C}_2$ and $\mathrm{B}^d_{\mathrm{opt}_k}(z) \cap C_2 \neq \emptyset$. □

3.2.3 Analysis of the remaining merge steps 

Let $Y$, $Z$, $\ell$, and $\mathcal{P}_1, \ldots, \mathcal{P}_{|Y|}$ be as given by Lemma 16. Then, Proposition 13 can be used to obtain an
upper bound for the cost of $\mathcal{P}_{2\ell}$. In the following, we analyze the merge steps leading from $\mathcal{P}_{2\ell}$ to $\mathcal{P}_{\ell+1}$
and show how to obtain an upper bound for the cost of $\mathcal{P}_{\ell+1}$. As in Section 3.2.1, we analyze the merge
steps in phases. The following lemma is used to bound the increase of the cost during a single phase.
Note that $\mathrm{opt}_k$ still refers to the cost of an optimal solution on input $X$, not $Y$.

Figure 5: Merging $(Z, \mathrm{opt}_k)$-connected clusters.



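Lemma 17. Let $m, n \in \mathbb{N}$ with $n \le 2\ell$ and $\ell < m \le n \le |Y|$. If there are no two $(Z, \mathrm{opt}_k)$-connected
clusters in $\mathcal{P}_m \cap \mathcal{P}_n$, it holds

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\lfloor (m+\ell)/2 \rfloor}) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_m) + 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k.
\]

Proof. We show that there exist at least $m - \ell$ disjoint pairs of clusters from $\mathcal{P}_m$ such that the radius
of their union can be upper bounded by $\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_m) + 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k$. By Observation 11, this
upper bounds the cost of the computed clusterings as long as such a pair of clusters remains. Then,
the lemma follows from the fact that in each iteration of its loop the algorithm can destroy at most two
of these pairs. To bound the number of these pairs of clusters, we start with a structural observation.
$\mathcal{P}_m \cap \mathcal{P}_n$ is the set of clusters from $\mathcal{P}_n$ that still exist in $\mathcal{P}_m$. By our definition of $Y$, $Z$, and $\ell$, we conclude
that any cluster $A \in \mathcal{P}_m \cap \mathcal{P}_n$ is $(Z, \mathrm{opt}_k)$-connected to another cluster $B \in \mathcal{P}_m$. If we assume that there
are no two $(Z, \mathrm{opt}_k)$-connected clusters in $\mathcal{P}_m \cap \mathcal{P}_n$, this implies $B \in \mathcal{P}_m \setminus \mathcal{P}_n$ (see Figure 5). Thus, using
$A \in \mathcal{P}_n$, $B \in \mathcal{P}_m$, and Inequality (8), the radius of $A \cup B$ can be bounded by

\begin{equation}
\mathrm{rad}(A \cup B) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_m) + 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k. \tag{9}
\end{equation}

Moreover, using a similar argument, we derive the same bound for two clusters $A_1, A_2 \in \mathcal{P}_m \cap \mathcal{P}_n$ that
are $(Z, \mathrm{opt}_k)$-connected to the same cluster $B \in \mathcal{P}_m \setminus \mathcal{P}_n$. That is,

\begin{equation}
\mathrm{rad}(A_1 \cup A_2) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_m) + 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k. \tag{10}
\end{equation}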



Next, we show that there exist at least $\frac{|\mathcal{P}_m \cap \mathcal{P}_n|}{2}$ disjoint pairs of clusters from $\mathcal{P}_m$ such that the
radius of their union can be bounded either by Inequality (9) or by Inequality (10). To do so, we first
consider the pairs of clusters from $\mathcal{P}_m \cap \mathcal{P}_n$ that are $(Z, \mathrm{opt}_k)$-connected to the same cluster from $\mathcal{P}_m \setminus \mathcal{P}_n$
until no candidates are left. For these pairs, we can bound the radius of their union by Inequality (10).
Then, each cluster from $\mathcal{P}_m \setminus \mathcal{P}_n$ is $(Z, \mathrm{opt}_k)$-connected to at most one of the remaining clusters from
$\mathcal{P}_m \cap \mathcal{P}_n$. Thus, each remaining cluster $A \in \mathcal{P}_m \cap \mathcal{P}_n$ can be paired with a different cluster $B \in \mathcal{P}_m \setminus \mathcal{P}_n$
such that $A$ and $B$ are $(Z, \mathrm{opt}_k)$-connected. For these pairs, we can bound the radius of their union by
Inequality (9). Since for all pairs either one or both of the clusters come from the set $\mathcal{P}_m \cap \mathcal{P}_n$, we can
lower bound the number of pairs by $\frac{|\mathcal{P}_m \cap \mathcal{P}_n|}{2}$.

To complete the proof, we show that $m - \ell \le \frac{|\mathcal{P}_m \cap \mathcal{P}_n|}{2}$. In each iteration of its loop, the algorithm
can merge at most two clusters from $\mathcal{P}_n$. Therefore, there are at least $\frac{n - |\mathcal{P}_m \cap \mathcal{P}_n|}{2}$ merge steps between
the computations of $\mathcal{P}_n$ and $\mathcal{P}_m$. Hence, $m \le n - \frac{n - |\mathcal{P}_m \cap \mathcal{P}_n|}{2}$. Using $n \le 2\ell$, we get

\[
m - \ell \le \frac{|\mathcal{P}_m \cap \mathcal{P}_n|}{2}.
\]
□



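Lemma 18. Let $n \in \mathbb{N}$ with $n \le 2\ell$ and $\ell < n \le |Y|$. Then,

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\ell+1}) \le 2(\log_2(\ell) + 2) \cdot (\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + \mathrm{opt}_k).
\]

Proof. For $n = \ell + 1$ there is nothing to show. Hence, assume $n > \ell + 1$. Then, by definition of $Z$, there
exist two $(Z, \mathrm{opt}_k)$-connected clusters in $\mathcal{P}_n$. Now let $\tilde{n} \in \mathbb{N}$ with $\tilde{n} < n$ be maximal such that no two
$(Z, \mathrm{opt}_k)$-connected clusters exist in $\mathcal{P}_{\tilde{n}} \cap \mathcal{P}_n$. The number $\tilde{n}$ is well-defined since $|\mathcal{P}_1| = 1$ implies $\tilde{n} \ge 1$.
It follows that the same holds for all $m \in \mathbb{N}$ with $m \le \tilde{n}$. We conclude that Lemma 17 is applicable for
all $m \in \mathbb{N}$ with $\ell < m \le \tilde{n}$.

By the definition of $\tilde{n}$ there still exist at least two $(Z, \mathrm{opt}_k)$-connected clusters in $\mathcal{P}_{\tilde{n}+1} \cap \mathcal{P}_n$. Then,
Observation 11 implies

\begin{equation}
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\tilde{n}}) \le 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + \mathrm{opt}_k. \tag{11}
\end{equation}

If $\tilde{n} \le \ell + 1$ then Inequality (11) proves the lemma. For $\tilde{n} > \ell + 1$ let $u := \lceil \log_2(\tilde{n} - \ell) \rceil$ and define
$m_i := \left\lceil \left(\tfrac{1}{2}\right)^i (\tilde{n} - \ell) + \ell \right\rceil > \ell$ for all $i = 0, \ldots, u$. Then, $m_0 = \tilde{n}$ and $m_u = \ell + 1$. Furthermore, we obtain

\[
\left\lfloor \frac{m_i + \ell}{2} \right\rfloor
= \left\lfloor \frac{1}{2} \left( \left\lceil \left(\tfrac{1}{2}\right)^i (\tilde{n} - \ell) + \ell \right\rceil + \ell \right) \right\rfloor
\le \left\lfloor \left(\tfrac{1}{2}\right)^{i+1} (\tilde{n} - \ell) + \ell + \tfrac{1}{2} \right\rfloor
\le m_{i+1}.
\]

Since Algorithm 2 uses a greedy strategy, we deduce $\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\lfloor (m_i + \ell)/2 \rfloor})$ for all $i =
0, \ldots, u - 1$. Combining this with Lemma 17 (applied to $m = m_i$), we obtain

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{m_i}) + 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k.
\]

By repeatedly applying this inequality for $i = 0, \ldots, u - 1$ and summing up the costs, we get

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{m_u}) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\tilde{n}}) + 2u \cdot (\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + \mathrm{opt}_k) \le 2(u + 1) \cdot (\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) + \mathrm{opt}_k),
\]

where the second step uses Inequality (11). Since $\tilde{n} \le 2\ell$, we get $u \le \log_2(\ell) + 1$ and the lemma follows using $m_u = \ell + 1$. □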

The following lemma finishes the analysis except for the last merge step. 

Lemma 19. Let $Y \subset \mathbb{R}^d$ be finite and $\ell < |Y|$ such that $Y$ is $(\ell, \mathrm{opt}_k)$-coverable. Furthermore, let
$Z \subset \mathbb{R}^d$ with $|Z| = \ell$ such that for all $n \in \mathbb{N}$ with $\ell + 1 \le n \le |Y|$ every cluster in $\mathcal{P}_n$ is $(Z, \mathrm{opt}_k)$-connected
to another cluster in $\mathcal{P}_n$, where $\mathcal{P}_1, \ldots, \mathcal{P}_{|Y|}$ denotes the hierarchical clustering computed by
Algorithm 2 on input $Y$. Then,

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\ell+1}) < 2(\log_2(\ell) + 2) \cdot (24d \cdot e^{24d} + 1) \cdot \mathrm{opt}_k.
\]

Proof. Let $n := \min(|Y|, 2\ell)$. Then, using Proposition 13, we get $\mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_n) < 24d \cdot e^{24d} \cdot \mathrm{opt}_k$. The
lemma follows by using this bound in combination with Lemma 18. □

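3.2.4 Proof of Theorem 12

Using Lemma 16, we know that there is a subset $Y \subseteq X$, a number $\ell \le k$, and a hierarchical clustering
$\mathcal{P}_1, \ldots, \mathcal{P}_{|Y|}$ of $Y$ with $\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) \le \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_\ell)$. Furthermore, there is a set $Z \subset \mathbb{R}^d$ such that every
cluster from $\mathcal{P}_{\ell+1}$ is $(Z, \mathrm{opt}_k)$-connected to another cluster in $\mathcal{P}_{\ell+1}$. Thus, $\mathcal{P}_{\ell+1}$ contains two clusters
$A, B$ that intersect with the same ball of radius $\mathrm{opt}_k$. Hence

\[
\mathrm{cost}_{\mathrm{rad}}(\mathcal{C}_k) \le \mathrm{rad}(A \cup B) \le 2 \cdot \mathrm{cost}_{\mathrm{rad}}(\mathcal{P}_{\ell+1}) + \mathrm{opt}_k.
\]

The theorem follows using Lemma 19 and $\ell \le k$. □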



3.3 Diameter k-clustering

In this section, we analyze the agglomerative complete linkage clustering algorithm for Problem 3 stated
as Algorithm 3. Again, the only difference to Algorithm 1 and 2 is the minimization of the diameter
in Step 3. As in the analysis of Algorithm 2, we may assume that for any input set $X$ the clusterings
computed by Algorithm 3 are uniquely determined, i.e. the minimum in Step 3 is always unambiguous.

Note that in this section cost always means diameter cost and $\mathrm{opt}_k$ refers to the cost of an optimal
diameter k-clustering of $X \subset \mathbb{R}^d$ where $k \in \mathbb{N}$ with $k \le |X|$. Analogously to the (discrete) radius case,
any cluster $C$ is contained in a ball of radius $\mathrm{diam}(C)$ and thus the set $X$ is $(k, \mathrm{opt}_k)$-coverable.

Observation 20 (analogous to Observation 7 and 11). The cost of all computed clusterings is equal to
the diameter of the cluster created last. Furthermore, the diameter of the union of any two clusters is 
always an upper bound for the cost of the clustering to be computed next.

AgglomerativeCompleteLinkage($X$):
$X$: finite set of input points from $\mathbb{R}^d$

1: $\mathcal{C}_{|X|} := \{ \{x\} \mid x \in X \}$
2: for $i = |X| - 1, \ldots, 1$ do
3:     find distinct clusters $A, B \in \mathcal{C}_{i+1}$ minimizing $\mathrm{diam}(A \cup B)$
4:     $\mathcal{C}_i := (\mathcal{C}_{i+1} \setminus \{A, B\}) \cup \{A \cup B\}$
5: end for
6: return $\mathcal{C}_1, \ldots, \mathcal{C}_{|X|}$

Algorithm 3: The agglomerative complete linkage clustering algorithm.
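For experimentation, Algorithm 3 can be transcribed almost literally. The following Python sketch is a naive, cubic-time illustration and not an optimized implementation. Two assumptions go beyond the text: distances come from an $\ell_p$-norm selected by the parameter `norm`, and ties in Step 3 are broken by enumeration order (the analysis assumes the minimum is unambiguous). The names `dist`, `diam`, `cost` and `complete_linkage` are chosen for this sketch only.

```python
import math
from itertools import combinations


def dist(p, q, norm=2.0):
    """Distance induced by the l_p norm (norm=math.inf gives the maximum norm)."""
    diffs = [abs(a - b) for a, b in zip(p, q)]
    if math.isinf(norm):
        return max(diffs)
    return sum(d ** norm for d in diffs) ** (1.0 / norm)


def diam(cluster, norm=2.0):
    """Largest pairwise distance inside a cluster (0 for singletons)."""
    return max((dist(p, q, norm) for p, q in combinations(cluster, 2)), default=0.0)


def cost(clustering, norm=2.0):
    """Diameter cost of a clustering: the largest cluster diameter."""
    return max(diam(c, norm) for c in clustering)


def complete_linkage(points, norm=2.0):
    """Return a dict mapping i to the computed i-clustering for i = |X|, ..., 1."""
    clustering = [frozenset([p]) for p in points]      # Step 1: singletons
    hierarchy = {len(clustering): list(clustering)}
    while len(clustering) > 1:                          # Step 2
        # Step 3: pick the pair whose union has minimum diameter.
        a, b = min(combinations(clustering, 2),
                   key=lambda pair: diam(pair[0] | pair[1], norm))
        # Step 4: replace A and B by their union.
        clustering = [c for c in clustering if c is not a and c is not b] + [a | b]
        hierarchy[len(clustering)] = list(clustering)
    return hierarchy
```

Points are expected as distinct tuples, for example `complete_linkage([(0.0,), (1.0,), (3.0,), (7.0,)])`, and `cost(hierarchy[k])` then gives the diameter cost of the computed k-clustering.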



The following theorem states our main result.

Theorem 21. Let $X \subset \mathbb{R}^d$ be a finite set of points. Then, for all $k \in \mathbb{N}$ with $k \le |X|$, the partition $\mathcal{C}_k$
of $X$ into $k$ clusters as computed by Algorithm 3 satisfies

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_k) = O(\log k) \cdot \mathrm{opt}_k,
\]

where $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 3, and the constant hidden in the O-notation
is doubly exponential in the dimension $d$.

As in the proof of Theorem 5 and 12, we first show a bound for the cost of the intermediate 2k-clustering.
However, we have to apply a different analysis again. This time, the new analysis results in
a bound that depends doubly exponentially on the dimension.

3.3.1 Analysis of the 2k-clustering

Proposition 22. Let $X \subset \mathbb{R}^d$ be finite. Then, for all $k \in \mathbb{N}$ with $2k \le |X|$, the partition $\mathcal{C}_{2k}$ of $X$ into
$2k$ clusters as computed by Algorithm 3 satisfies

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2k}) < 2^{3\sigma} (28d + 6) \cdot \mathrm{opt}_k,
\]

where $\sigma = (42d)^d$ and $\mathrm{opt}_k$ denotes the cost of an optimal solution to Problem 3.

In our analysis of the k-center problem, we made use of the fact that merging two clusters lying
inside a ball of some radius $r$ results in a new cluster of radius at most $r$. This is no longer true for the
diameter k-clustering problem. We are not able to derive a bound for the diameter of the new cluster
that is significantly less than $2r$. The additional factor of 2 makes our analysis from Section 3.2.1 useless
for the diameter case.

To prove Proposition 22, we divide the merge steps of Algorithm 3 into two stages. The first stage
consists of the merge steps down to a $2^{2^{O(d \log d)}} k$-clustering. The analysis of the first stage is based on
the following notion of similarity. Two clusters are called similar if one cluster can be translated such
that every point of the translated cluster is near a point of the second cluster. Then, by merging similar
clusters, the diameter essentially increases by the length of the translation vector. During the first stage,
we guarantee that there is a sufficiently large number of similar clusters left. The cost of the intermediate
$2^{2^{O(d \log d)}} k$-clustering can be upper bounded by $O(d) \cdot \mathrm{opt}_k$.

The second stage consists of the steps reducing the number of remaining clusters from $2^{2^{O(d \log d)}} k$ to
only $2k$. In this stage, we are no longer able to guarantee that a sufficiently large number of similar
clusters exists. Therefore, we analyze the merge steps of the second stage using a weaker argument, very
similar to the one used in the second step of the analysis in the discrete k-center case (cf. Section 3.1.2).
As long as there are more than $2k$ clusters left, we are able to find sufficiently many pairs of clusters that
intersect with the same cluster of an optimal k-clustering. Therefore, we can bound the cost of merging
such a pair by the sum of the diameters of the two clusters plus the diameter of the optimal cluster. We
find that the cost of the intermediate 2k-clustering is upper bounded by $2^{2^{O(d \log d)}} \cdot \mathrm{opt}_k$. Let us remark
that we do not obtain our main result if we already use this argument for the first stage.

Both stages are again subdivided into phases, such that in each phase the number of remaining
clusters is reduced by one fourth.

Figure 6: Congruent configurations.



3.3.2 Stage one 

The following lemma will be used to bound the increase of the cost during a single phase. 

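Lemma 23. Let $\lambda \in \mathbb{R}$ with $0 < \lambda < 1$ and $\rho := \left\lceil \left(\tfrac{3}{\lambda}\right)^d \right\rceil$. Furthermore, let $m \in \mathbb{N}$ with $2^{\rho+1} k < m \le |X|$.
Then,

\begin{equation}
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{\lfloor 3m/4 \rfloor}) < (1 + 2\lambda) \cdot \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m) + 4 \sqrt[d]{\frac{2^{\rho+1} k}{m}} \cdot \mathrm{opt}_k. \tag{12}
\end{equation}

Proof. From every cluster $C \in \mathcal{C}_m$, we fix an arbitrary point and denote it by $p_C$. Let $R := \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m)$.
Then, the distance from $p_C$ to any $q \in C$ is at most $R$ and we get $C - p_C \subseteq \mathrm{B}^d_R(0)$.

A ball of radius $R$ can be covered by $\rho$ balls of radius $\lambda R$ (see [13]). Hence, there exist $y_1, \ldots, y_\rho \in \mathbb{R}^d$
with $\mathrm{B}^d_R(0) \subseteq \bigcup_{i=1}^{\rho} \mathrm{B}^d_{\lambda R}(y_i)$. For $C \in \mathcal{C}_m$, we call the set $\mathrm{Conf}(C) := \{ y_i \mid 1 \le i \le \rho \text{ and } \mathrm{B}^d_{\lambda R}(y_i) \cap
(C - p_C) \neq \emptyset \}$ the configuration of $C$. That is, we identify each cluster $C \in \mathcal{C}_m$ with the subset of the
balls $\mathrm{B}^d_{\lambda R}(y_1), \ldots, \mathrm{B}^d_{\lambda R}(y_\rho)$ that intersect with $C - p_C$. Note that no cluster $C \in \mathcal{C}_m$ has an empty
configuration. The number of possible configurations is upper bounded by $2^\rho$.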
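To make the notion of a configuration concrete, the following Python sketch computes $\mathrm{Conf}(C)$ for a finite cluster. It is only an illustration under assumptions not made in the text: the Euclidean metric is used, balls are closed, the covering centers are passed in explicitly, and the function name `configuration` is hypothetical.

```python
import math


def configuration(cluster, anchor, centers, radius):
    """Indices i such that the closed ball of the given radius around centers[i]
    meets the translated cluster C - p_C, where p_C is the chosen anchor point."""
    shifted = [tuple(c - a for c, a in zip(x, anchor)) for x in cluster]
    return frozenset(
        i for i, y in enumerate(centers)
        if any(math.dist(s, y) <= radius for s in shifted)
    )
```

Two clusters that are translates of each other receive the same configuration, for example `[(0.0, 0.0), (1.0, 0.0)]` anchored at `(0.0, 0.0)` and `[(5.0, 5.0), (6.0, 5.0)]` anchored at `(5.0, 5.0)`. This is the similarity that the first stage of the analysis exploits.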

Let $t := \lfloor 3m/4 \rfloor$ and $\mathcal{C}_m \cap \mathcal{C}_{t+1}$ be the set of clusters from $\mathcal{C}_m$ that still exist $\lceil m/4 \rceil - 1$ merge steps after
the computation of $\mathcal{C}_m$. In each iteration of its loop, the algorithm can merge at most two clusters from
$\mathcal{C}_m$. Thus, $|\mathcal{C}_m \cap \mathcal{C}_{t+1}| > \frac{m}{2}$. It follows that there exist $j > \frac{m}{2^{\rho+1}}$ distinct clusters $C_1, \ldots, C_j \in \mathcal{C}_m \cap \mathcal{C}_{t+1}$
with the same configuration. Using $m > 2^{\rho+1} k$, we deduce $j > k$.

Let $P := \{ p_{C_1}, \ldots, p_{C_j} \}$. Since $X$ is $(k, \mathrm{opt}_k)$-coverable, so is $P \subseteq X$. Therefore, by Lemma [5j there
exist distinct $a, b \in \{1, \ldots, j\}$ such that $\| p_{C_a} - p_{C_b} \| < 4 \sqrt[d]{\frac{2^{\rho+1} k}{m}} \cdot \mathrm{opt}_k$.

Next, we want to bound the diameter of the union of the corresponding clusters $C_a$ and $C_b$. The
distance between any two points $u, v \in C_a$ or $u, v \in C_b$ is at most the cost of $\mathcal{C}_m$. Now let $u \in C_a$ and
$v \in C_b$. Using the triangle inequality, for any $w \in \mathbb{R}^d$, we obtain $\| u - v \| \le \| p_{C_a} - p_{C_b} \| + \| u + p_{C_b} -
p_{C_a} - w \| + \| w - v \|$ (see Figure 6).

For $\| p_{C_a} - p_{C_b} \|$, we just derived an upper bound. To bound $\| u + p_{C_b} - p_{C_a} - w \|$, we let $y \in
\mathrm{Conf}(C_a) = \mathrm{Conf}(C_b)$ such that $u - p_{C_a} \in \mathrm{B}^d_{\lambda R}(y)$. Furthermore, we fix $w \in C_b$ with $w - p_{C_b} \in \mathrm{B}^d_{\lambda R}(y)$.
Hence, $\| u + p_{C_b} - p_{C_a} - w \| = \| u - p_{C_a} - (w - p_{C_b}) \|$ can be upper bounded by $2\lambda R = 2\lambda \cdot \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m)$.
For $w \in C_b$ the distance $\| w - v \|$ is bounded by $\mathrm{diam}(C_b) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m)$. We conclude that merging
clusters $C_a$ and $C_b$ results in a cluster whose diameter can be upper bounded by

\[
\mathrm{diam}(C_a \cup C_b) < (1 + 2\lambda) \cdot \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m) + 4 \sqrt[d]{\frac{2^{\rho+1} k}{m}} \cdot \mathrm{opt}_k.
\]

Using Observation 20 and the fact that $C_a$ and $C_b$ are part of the clustering $\mathcal{C}_{t+1}$, we can upper bound
the cost of $\mathcal{C}_t$ by $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_t) \le \mathrm{diam}(C_a \cup C_b)$. □

Note that the parameter $\lambda$ from Lemma 23 establishes a trade-off between the two terms on the
right-hand side of Inequality (12). To complete the analysis of the first stage, we have to carefully
choose $\lambda$. In the proof of the following lemma, we use $\lambda = \ln\tfrac{4}{3} / (4d)$ and apply Lemma 23 for
$\left\lceil \log_{3/4} \frac{2^{\sigma+1} k}{|X|} \right\rceil$ consecutive phases, where $\sigma = (42d)^d$. Then, we are able to upper bound the total increase of the cost
by a term that is linear in $d$ and $r$ and independent of $|X|$ and $k$. The number of remaining clusters
is independent of the number of input points $|X|$ and only depends on the dimension $d$ and the desired
number of clusters $k$.

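Lemma 24. Let $2^{\sigma+1} k < |X|$ for $\sigma = (42d)^d$. Then, on input $X$, Algorithm 3 computes a clustering
$\mathcal{C}_{2^{\sigma+1} k}$ with $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k}) < (28d + 4) \cdot \mathrm{opt}_k$.

Proof. Let $u := \left\lceil \log_{3/4} \frac{2^{\sigma+1} k}{|X|} \right\rceil$ and define $m_i := \left\lceil \left(\tfrac{3}{4}\right)^i |X| \right\rceil$ for all $i = 0, \ldots, u$. Furthermore, let $\lambda = \ln\tfrac{4}{3} / (4d)$.
This implies $\rho \le \sigma$ for the parameter $\rho$ of Lemma 23. Then, $m_u \le 2^{\sigma+1} k$ and $m_i > 2^{\sigma+1} k \ge 2^{\rho+1} k$
for all $i = 0, \ldots, u - 1$. Since

\[
\left\lfloor \frac{3 m_i}{4} \right\rfloor = \left\lfloor \frac{3}{4} \left\lceil \left(\tfrac{3}{4}\right)^i |X| \right\rceil \right\rfloor \le \left\lfloor \left(\tfrac{3}{4}\right)^{i+1} |X| + \tfrac{3}{4} \right\rfloor \le \left\lceil \left(\tfrac{3}{4}\right)^{i+1} |X| \right\rceil = m_{i+1}
\]

and Algorithm 3 uses a greedy strategy, we deduce $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_{i+1}}) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{\lfloor 3 m_i / 4 \rfloor})$ for all $i = 0, \ldots, u - 1$.
Combining this with Lemma 23 (applied to $m = m_i$), we obtain

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_{i+1}}) < (1 + 2\lambda) \cdot \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_i}) + 4 \sqrt[d]{\frac{2^{\sigma+1} k}{m_i}} \cdot \mathrm{opt}_k.
\]

By repeatedly applying this inequality for $i = 0, \ldots, u - 1$ and using $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k}) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_u})$
and $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_0}) = 0$, we get

\begin{align*}
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k})
&< \sum_{i=0}^{u-1} (1 + 2\lambda)^i \cdot 4 \sqrt[d]{\frac{2^{\sigma+1} k}{\left(\tfrac{3}{4}\right)^{u-1-i} |X|}} \cdot \mathrm{opt}_k \\
&= 4 \sqrt[d]{\frac{2^{\sigma+1} k}{|X|}} \cdot \mathrm{opt}_k \sum_{i=0}^{u-1} (1 + 2\lambda)^i \sqrt[d]{\left(\tfrac{4}{3}\right)^{u-1-i}}.
\end{align*}

Using $u - 1 < \log_{3/4} \frac{2^{\sigma+1} k}{|X|}$, we deduce

\begin{equation}
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k}) < 4\,\mathrm{opt}_k \sum_{i=0}^{u-1} \left( \frac{1 + 2\lambda}{\sqrt[d]{4/3}} \right)^i. \tag{13}
\end{equation}

By taking only the first two terms of the series expansion of the exponential function, we get $1 + 2\lambda
< e^{2\lambda} = \sqrt[2d]{4/3}$. Substituting this bound into Inequality (13) and extending the sum gives

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k}) < 4\,\mathrm{opt}_k \sum_{i=0}^{u-1} \left( \frac{\sqrt[2d]{4/3}}{\sqrt[d]{4/3}} \right)^i < 4\,\mathrm{opt}_k \sum_{i=0}^{\infty} \left( \frac{1}{\sqrt[2d]{4/3}} \right)^i.
\]

Solving the geometric series, and using $e^{2\lambda} - 1 > 2\lambda$ together with $\frac{1}{2\lambda} = 2d / \ln\tfrac{4}{3} < 7d$, leads to

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2^{\sigma+1} k}) < 4 \left( \frac{1}{2\lambda} + 1 \right) \cdot \mathrm{opt}_k < (28d + 4) \cdot \mathrm{opt}_k.
\]

□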
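The phase schedule used in this proof can be checked mechanically. The following Python sketch computes $m_i = \lceil (3/4)^i |X| \rceil$ with exact integer arithmetic and stops at the first value that does not exceed a given target (the role played by $2^{\sigma+1} k$ above). The function name `phase_schedule` and its parameters are illustrative, not taken from the paper.

```python
from typing import List


def phase_schedule(n_points: int, target: int) -> List[int]:
    """Return m_0, ..., m_u with m_i = ceil((3/4)^i * n_points),
    where u is the first index with m_u <= target (requires target >= 1)."""
    schedule = []
    i = 0
    while True:
        # Exact integer ceiling of 3^i * n_points / 4^i, avoiding floating point.
        m_i = -((-(3 ** i) * n_points) // (4 ** i))
        schedule.append(m_i)
        if m_i <= target:
            return schedule
        i += 1


sched = phase_schedule(1000, 10)
# Each phase of at most a quarter of the remaining merges lands at or below the next milestone.
assert all((3 * m) // 4 <= nxt for m, nxt in zip(sched, sched[1:]))
```

The assertion mirrors the inequality $\lfloor 3 m_i / 4 \rfloor \le m_{i+1}$ that allows Lemma 23 to be chained from one phase to the next.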



3.3.3 Stage two 

The second stage covers the remaining merge steps until Algorithm 3 computes the clustering $\mathcal{C}_{2k}$.
However, compared to stage one, the analysis of a single phase yields a weaker bound. The following
lemma provides an analysis of a single phase of the second stage. It is very similar to Lemma 9 and
Lemma 10 in the analysis of the discrete k-center problem.

Lemma 25. Let $m \in \mathbb{N}$ with $2k < m \le |X|$. Then,

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{\lfloor 3m/4 \rfloor}) \le 2 \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m) + \mathrm{opt}_k).
\]

Figure 7: Merging two clusters intersecting with a ball of radius $r$.



Proof. Let $t := \lfloor 3m/4 \rfloor$. Then, $\mathcal{C}_m \cap \mathcal{C}_{t+1}$ is the set of clusters from $\mathcal{C}_m$ which still exist $\lceil m/4 \rceil - 1 < \frac{m}{4}$ merge
steps after the computation of $\mathcal{C}_m$. In each iteration of its loop the algorithm can merge at most two
clusters from $\mathcal{C}_m$. Thus, $|\mathcal{C}_m \cap \mathcal{C}_{t+1}| > \frac{m}{2} > k$. Since $X$ is $(k, \mathrm{opt}_k)$-coverable there exists a point $y \in \mathbb{R}^d$
such that $\mathrm{B}^d_{\mathrm{opt}_k}(y)$ intersects with two clusters $A, B \in \mathcal{C}_m \cap \mathcal{C}_{t+1}$. We conclude that merging $A$ and $B$
would result in a cluster whose diameter can be upper bounded by $\mathrm{diam}(A \cup B) \le 2\,\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_m) + 2\,\mathrm{opt}_k$
(cf. Figure 7). The result follows using $A, B \in \mathcal{C}_{t+1}$ and Observation 20. □

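Lemma 26. Let $n \in \mathbb{N}$ with $n \le 2^{\sigma+1} k$ and $2k < n \le |X|$ for $\sigma = (42d)^d$. Then, on input $X$,
Algorithm 3 computes a clustering $\mathcal{C}_{2k}$ with

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2k}) < 2^{3\sigma} (\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_n) + 2\,\mathrm{opt}_k).
\]

Proof. Let $u := \left\lceil \log_{3/4} \frac{2k}{n} \right\rceil$ and define $m_i := \left\lceil \left(\tfrac{3}{4}\right)^i n \right\rceil$ for all $i = 0, \ldots, u$. Then, $m_u \le 2k$ and $m_i > 2k$ for
all $i = 0, \ldots, u - 1$. Analogously to the proof of Lemma 24, we get $\lfloor 3 m_i / 4 \rfloor \le m_{i+1}$ and using Lemma 25 we
deduce $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_{i+1}}) \le 2 \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_i}) + \mathrm{opt}_k)$ for all $i = 0, \ldots, u - 1$. By repeatedly applying this
inequality and using $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2k}) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_u})$, we get $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{2k}) < 2^u \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_n) + 2\,\mathrm{opt}_k)$.
Hence using $u \le \left\lceil \log_{4/3} 2^{\sigma} \right\rceil < 3\sigma$, the result follows.

□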
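The repeated application in the proof of Lemma 26 can be spelled out as follows. Here $c_i$ abbreviates $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_{m_i})$, a shorthand used only in this sketch, so that $c_0 = \mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_n)$ because $m_0 = n$; the last step is strict under the assumption $\mathrm{opt}_k > 0$.

```latex
\begin{align*}
c_u &\le 2\,c_{u-1} + 2\,\mathrm{opt}_k \\
    &\le 4\,c_{u-2} + (4 + 2)\,\mathrm{opt}_k \\
    &\;\;\vdots \\
    &\le 2^u c_0 + \Bigl(\sum_{i=1}^{u} 2^i\Bigr)\mathrm{opt}_k
     = 2^u c_0 + (2^{u+1} - 2)\,\mathrm{opt}_k \\
    &< 2^u \,(c_0 + 2\,\mathrm{opt}_k).
\end{align*}
```

Each phase doubles both the accumulated cost and the additive term, which is why the bound of the second stage is exponential in the number of phases $u$.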



Proposition 22 follows immediately by combining Lemma 24 and Lemma 26.



3.3.4 Analysis of the remaining merge steps 

We analyze the remaining merge steps analogously to the k-center problem. Therefore, in this section
we only discuss the differences, most of which are slightly modified bounds for the cost of merging two
clusters (cf. Figure 8).

The connectivity property from Section 3.2.2 remains the same. However, for any two $(Z, r)$-connected
clusters $A, B$, we use

\begin{equation}
\mathrm{diam}(A \cup B) \le \mathrm{diam}(A) + \mathrm{diam}(B) + 2r \tag{14}
\end{equation}

as a replacement for Inequality (8). Furthermore, Lemma 16 also holds for the diameter k-clustering
problem, i.e. with $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_k) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_\ell)$.

Using Inequality (14) in the proof of Lemma 17, we get

\[
\mathrm{diam}(A \cup B) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_m) + \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k
\]

as a replacement for Inequality (9) while Inequality (10) can be replaced by

\[
\mathrm{diam}(A_1 \cup A_2) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_m) + 2 \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k).
\]

That is, for the diameter k-clustering problem the two upper bounds are different. However, the second
one is larger than the first one. Using it in both cases, the inequality from Lemma 17 changes slightly to

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_{\lfloor (m+\ell)/2 \rfloor}) \le \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_m) + 2 \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k).
\]

Together with $\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_{\tilde{n}}) \le 2 \cdot \mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k$ as a replacement for Inequality (11), the bound
from Lemma 18 becomes

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_{\ell+1}) \le 2(\log_2(\ell) + 2) \cdot (\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_n) + 2\,\mathrm{opt}_k).
\]

Thus, using Proposition 22, the upper bound for the cost of the $(\ell + 1)$-clustering of $Y$ from Lemma 19
becomes

\[
\mathrm{cost}_{\mathrm{diam}}(\mathcal{P}_{\ell+1}) < 2(\log_2(\ell) + 2) \cdot (2^{3\sigma} (28d + 6) + 2) \cdot \mathrm{opt}_k
\]

for $\sigma = (42d)^d$. Analogously to Section 3.2.4, this proves Theorem 21.

3.4 Analysis of the one-dimensional case 

For $d = 1$, we are able to show that Algorithm 3 computes an approximation to Problem 3 with an
approximation factor of at most 3. We even know that for any input set $X \subset \mathbb{R}$ the approximation
factor of the computed solution is strictly below 3. However, we do not show an approximation factor of
$3 - \varepsilon$ for some $\varepsilon > 0$. The proof of this upper bound is very technical, makes extensive use of the total
order of the real numbers, and is certainly not generalizable to higher dimensions. Therefore, we omit
it.

4 Lower bounds 

In this section, we present constructions of several input sets yielding lower bounds for the approximation
factor of Algorithm 3. To this end, we look into possible runs of the algorithm. Whenever Algorithm 3
is able to choose between several possible merge steps generating a cluster of equal minimum diameter,
we simply assume that we can govern its choice.

In Section 4.1, we show that for input sets $X \subset \mathbb{R}$ (i.e. $d = 1$) Algorithm 3 has an approximation
factor of at least 2.5. In Section 3.4, we stated that in this case Algorithm 3 computes a solution to
Problem 3 with approximation factor strictly below 3. Hence, for $d = 1$, we obtain almost matching
upper and lower bounds for the cost of the solution computed by Algorithm 3.

Furthermore, in Section 4.2, we show that the dimension $d$ has an impact on the approximation factor
of Algorithm 3. This follows from a 2-dimensional input set yielding a lower bound of 3 for the metric
based on the $\ell_\infty$-norm. Note that this exceeds the upper bound from the one-dimensional case.

Moreover, in Section 4.4, we show that there exist input instances such that Algorithm 3 computes an
approximation to Problem 3 with an approximation factor of $\Omega(\sqrt[p]{\log k})$ for metrics based on an $\ell_p$-norm
($1 \le p < \infty$) and $\Omega(\log k)$ for the metric based on the $\ell_\infty$-norm. In case of the $\ell_1$- and the $\ell_\infty$-norm,
this matches the already known lower bound [1] that has been shown using a rather artificial metric.
However, the bound in [3] is derived from a 2-dimensional input set, while in our instances the dimension
depends on $k$.

Finally, we will see that the lower bound of $\Omega(\sqrt[p]{\log k})$ for any $\ell_p$-norm and $\Omega(\log k)$ for the $\ell_\infty$-norm
can be adapted to the discrete k-center problem (see Section 4.4.1). In case of the $\ell_\infty$-norm, we thus
obtain almost matching upper and lower bounds for the cost of the solution computed by Algorithm 1.
Furthermore, we will be able to restrict the dependency on $d$ and $k$ of the approximation factor of
Algorithm 1.

4.1 Any metric and d = 1 

We first show a lower bound for the approximation factor of Algorithm 3 using a sequence of input sets
from $\mathbb{R}^d$ with $d = 1$. Since up to normalization there is only one metric for $d = 1$, without loss of
generality we assume the Euclidean metric.

Proposition 27. For all $\varepsilon > 0$ and $k \ge 4$ there exists an input set $X \subset \mathbb{R}$ such that Algorithm 3
computes a solution to Problem 3 with cost at least $\frac{5}{2} - \varepsilon$ times the cost of an optimal solution.

Proof. We show how to construct an input set for $k = 4$. The construction can easily be extended for
$k > 4$. For any fixed $n \in \mathbb{N}$, we consider the following instance. For $x \in \mathbb{R}$, we define a set $V(x)$ consisting
of $2^n$ equidistant points:

\[
V(x) := \{ x + i \mid i \in \mathbb{N} \text{ and } 0 \le i < 2^n \}.
\]

Figure 9: A part of the dendrogram for $W(x_i)$.

That is, neighboring points are at distance 1 and $\mathrm{diam}(V(x)) = 2^n - 1$. Furthermore, we define:

\begin{align*}
l(x) &:= x - 2^{n-1}, \\
r(x) &:= x + 2^n - 1 + 2^{n-1} = x + 3 \cdot 2^{n-1} - 1, \\
W(x) &:= V(x) \cup \{ l(x), r(x) \}.
\end{align*}

It follows that $\mathrm{diam}(W(x)) = 2^{n+1} - 1$ as shown in Figure 9.
We define the following input set $X$:

\[
X := \bigcup_{i=1}^{4} W(x_i)
\]

where $x_i := i \cdot (7 \cdot 2^{n-1} - 2)$ for $i = 1, \ldots, 4$. Then, there is a gap of $3 \cdot 2^{n-1} - 1$ between $W(x_i)$ and
$W(x_{i+1})$, i.e.

\begin{equation}
\mathrm{diam}(\{ r(x_i), l(x_{i+1}) \}) = 3 \cdot 2^{n-1} - 1 \quad \text{for } i = 1, \ldots, 3. \tag{15}
\end{equation}

The optimal 4-clustering of $X$ is

\[
\mathcal{C}_4^{\mathrm{opt}} := \{ W(x_1), W(x_2), W(x_3), W(x_4) \}
\]

and $\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_4^{\mathrm{opt}}) = 2^{n+1} - 1$. However, the solution computed by Algorithm 3 may be worse. At the
beginning, the minimum distance between two points from $X$ is 1. The possible pairs of points with
distance 1 come from the sets $V(x_i)$ for $i = 1, \ldots, 4$. Since the distance between $V(x_i)$ and $l(x_i)$ or
$r(x_i)$ is $2^{n-1}$, we can assume that the algorithm merges all points of $V(x_i)$ for $i = 1, \ldots, 4$ as shown in
Figure 9. It follows that Algorithm 3 computes the following 12-clustering:

\begin{align*}
\mathcal{C}_{12} = \{ & \{l(x_1)\}, V(x_1), \{r(x_1)\}, \\
& \{l(x_2)\}, V(x_2), \{r(x_2)\}, \\
& \{l(x_3)\}, V(x_3), \{r(x_3)\}, \\
& \{l(x_4)\}, V(x_4), \{r(x_4)\} \}.
\end{align*}

For $i = 1, \ldots, 4$, the diameters of $\{l(x_i)\} \cup V(x_i)$ and $V(x_i) \cup \{r(x_i)\}$ are equal to $3 \cdot 2^{n-1} - 1$ and
these are the best possible merge steps. Therefore, by (15), we can assume that Algorithm 3 merges
$r(x_i)$ and $l(x_{i+1})$ for $i = 1, \ldots, 3$ first. This results in the following 7-clustering:

\begin{align*}
\mathcal{C}_7 = \{ & \{l(x_1)\} \cup V(x_1), \\
& \{r(x_1), l(x_2)\}, V(x_2), \\
& \{r(x_2), l(x_3)\}, V(x_3), \\
& \{r(x_3), l(x_4)\}, V(x_4) \cup \{r(x_4)\} \}
\end{align*}

Figure 10: A part of the dendrogram for $X$.

where $V(x_2)$ and $V(x_3)$ have a diameter of $2^n - 1$ while the remaining clusters have a diameter of
$3 \cdot 2^{n-1} - 1$ (see Figure 10). Between two neighboring clusters of $\mathcal{C}_7$, there is a gap of $2^{n-1}$.

In the next step of Algorithm 3, the best possible choice is to merge $\{r(x_1), l(x_2)\}$ with $V(x_2)$,
$\{r(x_2), l(x_3)\}$ with $V(x_2)$ or $V(x_3)$, or $\{r(x_3), l(x_4)\}$ with $V(x_3)$. We let it merge $\{r(x_1), l(x_2)\}$ with
$V(x_2)$ and $\{r(x_3), l(x_4)\}$ with $V(x_3)$. This results in a 5-clustering where the clusters have alternating
lengths of $3 \cdot 2^{n-1} - 1$ and $3 \cdot 2^n - 2$ with gaps of $2^{n-1}$ between them. Then, in the step resulting in $\mathcal{C}_4$,
Algorithm 3 has to create a cluster of diameter $5 \cdot 2^n - 3$ as shown in Figure 10. Therefore, the computed
solution has an approximation factor of

\[
\frac{\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_4)}{\mathrm{cost}_{\mathrm{diam}}(\mathcal{C}_4^{\mathrm{opt}})} = \frac{5 \cdot 2^n - 3}{2^{n+1} - 1}.
\]

For $n$ going to infinity this approximation factor converges from below to $\frac{5}{2}$. □
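The arithmetic of this construction is easy to verify by machine. The following Python sketch builds the four blocks $W(x_1), \ldots, W(x_4)$ for a given $n \ge 1$ and checks the block diameter, the gap from (15), the diameter of the cluster created in the last step, and the resulting ratio. The function names `build_blocks` and `ratio` are illustrative choices and do not appear in the paper.

```python
from fractions import Fraction


def build_blocks(n: int):
    """Return W(x_1), ..., W(x_4) as sorted lists of integers (requires n >= 1)."""
    half = 2 ** (n - 1)
    blocks = []
    for i in range(1, 5):
        x = i * (7 * half - 2)
        v = [x + j for j in range(2 ** n)]   # V(x): 2^n consecutive integers
        left = x - half                       # l(x)
        right = x + 3 * half - 1              # r(x)
        blocks.append([left] + v + [right])
    return blocks


def ratio(n: int) -> Fraction:
    """Approximation factor (5 * 2^n - 3) / (2^(n+1) - 1) of the described run."""
    return Fraction(5 * 2 ** n - 3, 2 ** (n + 1) - 1)


n = 3
blocks = build_blocks(n)
# Every block W(x_i) has diameter 2^(n+1) - 1.
assert all(b[-1] - b[0] == 2 ** (n + 1) - 1 for b in blocks)
# Gap between r(x_i) and l(x_{i+1}), as in (15).
assert all(blocks[i + 1][0] - blocks[i][-1] == 3 * 2 ** (n - 1) - 1 for i in range(3))
# The final merge spans from l(x_1) to the last point of V(x_2).
assert blocks[1][-2] - blocks[0][0] == 5 * 2 ** n - 3
```

Using exact fractions, `ratio(n)` stays below $\frac{5}{2}$ for every $n$ and approaches it as $n$ grows, in line with the last sentence of the proof.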
4.2 $\ell_\infty$-metric and d = 2

In this section, we give a construction that needs only eight points from $\mathbb{R}^2$ and yields a lower bound
of 3 for the metric based on the $\ell_\infty$-norm. Recall that in Section 3.4 we stated that for $d = 1$ the
approximation factor of a computed solution is always strictly less than 3. Therefore, the lower bound
of 3 for $d = 2$ implies that the dimension $d$ has an impact on the approximation factor of Algorithm 3.

Proposition 28. For the metric based on the $\ell_\infty$-norm, there exists an input set $X \subset \mathbb{R}^2$ such that
Algorithm 3 computes a solution to Problem 3 with three times the cost of an optimal solution.

Proof. We prove the proposition by constructing an example input set for $k = 4$ (see Figure 11). Consider
the following eight points in $\mathbb{R}^2$:

\[
\begin{array}{ll}
A = (0, 1), & E = (-1, 2), \\
B = (1, 0), & F = (2, 1), \\
C = (0, -1), & G = (1, -2), \\
D = (-1, 0), & H = (-2, -1).
\end{array}
\]

The optimal 4-clustering of these points is

\[
\mathcal{C}_4^{\mathrm{opt}} = \{ \{A, E\}, \{B, F\}, \{C, G\}, \{D, H\} \}
\]

which has a maximum $\ell_\infty$-diameter of 1. However, it is also possible that Algorithm 3 starts by merging
$A$ with $B$ and $C$ with $D$. Then, in the third step, the algorithm will merge $E$ or $F$ with $\{A, B\}$, $G$ or $H$
with $\{C, D\}$, or $\{A, B\}$ with $\{C, D\}$. We assume the latter. Thus, in the fourth merge step a cluster of
$\ell_\infty$-diameter 3 will be created. □
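The eight-point instance is small enough to check every step of the described run. The following Python sketch recomputes the relevant $\ell_\infty$-diameters; the helper names `linf` and `diam` and the string encoding of clusters (one letter per point) are conveniences of this sketch only.

```python
from itertools import combinations

P = {"A": (0, 1), "B": (1, 0), "C": (0, -1), "D": (-1, 0),
     "E": (-1, 2), "F": (2, 1), "G": (1, -2), "H": (-2, -1)}


def linf(p, q):
    """Distance induced by the maximum norm."""
    return max(abs(a - b) for a, b in zip(p, q))


def diam(names):
    """Maximum-norm diameter of the cluster whose points are named by the letters in names."""
    return max((linf(P[u], P[v]) for u, v in combinations(names, 2)), default=0)


# Cost of the optimal 4-clustering.
optimal_cost = max(diam(c) for c in ["AE", "BF", "CG", "DH"])

# Cheapest possible third merge after {A, B} and {C, D} have been formed.
third_merge_cost = min(diam(a + b)
                       for a, b in combinations(["AB", "CD", "E", "F", "G", "H"], 2))

# Cheapest possible fourth merge after {A, B, C, D} has been formed.
fourth_merge_cost = min(diam(a + b)
                        for a, b in combinations(["ABCD", "E", "F", "G", "H"], 2))

print(optimal_cost, diam("AB"), diam("CD"), diam("ABCD"), third_merge_cost, fourth_merge_cost)
```

The third merge is a legitimate greedy choice because `diam("ABCD")` equals `third_merge_cost`, and the ratio `fourth_merge_cost / optimal_cost` gives the lower bound of the proposition.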

4.3 Euclidean metric and d = 3 

For the Euclidean case, we are able to construct a 3-dimensional instance that yields a lower bound of 
2.56. This is below the upper bound of 3 from the one-dimensional case. Therefore, this instance does 
not show an impact of the dimension d in the Euclidean case as in the previous section. But this lower 
bound is still better than the lower bound of 2.5 from the one-dimensional case. This suggests that in 
higher dimensions it might be easier to construct good lower bounds.

Figure 11: Lower bound for the metric based on the $\ell_\infty$-norm.

Figure 12: Lower bound for the metric based on the $\ell_2$-norm. The points $C, D, G, H$ have a
z-coordinate of 0, while the points $A, B, E, F$ have a z-coordinate of $2\sqrt{x}$.



Proposition 29. For the Euclidean metric there exists an input set $X \subset \mathbb{R}^3$ such that Algorithm 3
computes a solution to Problem 3 with cost 2.56 times the cost of an optimal solution.

Proof. We prove the proposition by constructing an example input set for $k = 4$ (see Figure 12). For
any fixed $x \in \mathbb{R}$ with $0 < x < 2$ consider the following eight points in $\mathbb{R}^3$:

\[
\begin{array}{ll}
A = (-1,\ 1,\ 2\sqrt{x}), & E = (-(1 + x),\ 1 + \sqrt{4 - x^2},\ 2\sqrt{x}), \\
B = (1,\ 1,\ 2\sqrt{x}), & F = (1 + x,\ 1 + \sqrt{4 - x^2},\ 2\sqrt{x}), \\
C = (-1,\ -1,\ 0), & G = (-(1 + x),\ -(1 + \sqrt{4 - x^2}),\ 0), \\
D = (1,\ -1,\ 0), & H = (1 + x,\ -(1 + \sqrt{4 - x^2}),\ 0).
\end{array}
\]



The optimal 4-clustering of these points is

\[ \mathcal{C}_4^* = \{\{A, E\}, \{B, F\}, \{C, G\}, \{D, H\}\}, \]

which has a maximum $\ell_2$-diameter of 2. However, since $\|A - B\| = \|C - D\| = 2$, it is possible that Algorithm 3 starts by merging A with B and C with D. Then, the cheapest merge adds one of the points E, F to the cluster {A, B}, or it adds one of the points G, H to the cluster {C, D}, or it merges {A, B} with {C, D}. We assume the latter. The resulting cluster {A, B, C, D} has a diameter of $2\sqrt{2 + x}$. Then, in the fourth merge step, the algorithm will either merge one of the pairs E, F and G, H, which creates a cluster of diameter $2(1 + x)$, or one of the pairs E, G and F, H. The choice depends on the parameter $x$. Note that Algorithm 3 will not merge the cluster {A, B, C, D} with one of the remaining four points, since this is always more expensive. The diameter of the created cluster is maximized for $x \approx 1.56$. If we fix $x = 1.56$, the algorithm merges E with F or G with H. This results in a 4-clustering of cost $2(1 + 1.56) = 5.12$, while the optimal solution has cost 2. □
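The distances used in this proof are easy to recompute numerically. The sketch below is a hedged illustration, not the authors' code: the function names `instance` and `diam` are hypothetical, and the coordinates follow the construction with the y-coordinate $1 + \sqrt{4 - x^2}$ for the outer points. It evaluates the cost of the clustering from the proof and the cheapest fourth merge for $x = 1.56$.

```python
from itertools import combinations
from math import dist, sqrt

def instance(x):
    """The eight points from the proof of Proposition 29 for 0 < x < 2."""
    y = 1 + sqrt(4 - x * x)   # y-coordinate of the outer points E, F (negated for G, H)
    z = 2 * sqrt(x)           # height of the upper layer A, B, E, F
    return {
        "A": (-1, 1, z), "B": (1, 1, z), "C": (-1, -1, 0), "D": (1, -1, 0),
        "E": (-(1 + x), y, z), "F": (1 + x, y, z),
        "G": (-(1 + x), -y, 0), "H": (1 + x, -y, 0),
    }

def diam(points, cluster):
    """Largest Euclidean distance between two named points of a cluster."""
    return max(dist(points[u], points[v]) for u, v in combinations(cluster, 2))

x = 1.56
pts = instance(x)

# Cost of the clustering {A,E}, {B,F}, {C,G}, {D,H}.
opt_cost = max(diam(pts, pair) for pair in ("AE", "BF", "CG", "DH"))

# Diameter of the cluster created in the third merge step.
abcd = diam(pts, "ABCD")

# After three merges the clusters are {A,B,C,D}, E, F, G, H; the cheapest
# remaining merge is the one complete linkage performs in the fourth step.
remaining = ["ABCD", "E", "F", "G", "H"]
fourth = min(diam(pts, c + d) for c, d in combinations(remaining, 2))

print(opt_cost, abcd, fourth, fourth / opt_cost)
```

Evaluating `fourth` for other values of `x` in (0, 2) is a simple way to explore how the choice of the parameter affects the bound.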

4.4 $\ell_p$-metric ($1 \le p \le \infty$) in variable dimension

In the following, we consider the diameter $k$-clustering problem with respect to the metric based on the $\ell_1$-norm. We show that there exists an input instance in dimension $O(k)$ such that Algorithm 3 computes a solution with an approximation factor of $\Omega(\log k)$.

Proposition 30. For the metric based on the $\ell_1$-norm, there exists an input set $X \subset \mathbb{R}^d$ with $d = k + \log_2 k$ such that Algorithm 3 computes a solution to Problem 3 with $\frac{1}{2}\log_2 k$ times the cost of an optimal solution.






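Proof. For simplicity's sake, assume $k$ to be a power of 2. In the sequel, we consider the $(k + \log_2 k)$-dimensional set $X$ of $|X| = k^2$ points defined by

\[ X = \left\{ [e_i, b]^\top \;\middle|\; 1 \le i \le k \text{ and } b \in \{0,1\}^{\log_2 k} \right\}. \]

Here, $e_i \in \mathbb{R}^k$ denotes the $i$-th canonical unit vector. Consider the following $k$-clustering

\[ \mathcal{C}_k^* = \left\{ C_b \;\middle|\; b \in \{0,1\}^{\log_2 k} \right\}, \]

where for each $b \in \{0,1\}^{\log_2 k}$ cluster $C_b$ is given by

\[ C_b = \left\{ [e_i, b]^\top \;\middle|\; 1 \le i \le k \right\}. \]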

The largest diameter of $\mathcal{C}_k^*$ is $\operatorname{cost}_{\mathrm{diam}}(\mathcal{C}_k^*) = 2$. Hence for $\mathrm{opt}_k$, the diameter of an optimal solution, it holds

\[ \mathrm{opt}_k \le 2. \tag{16} \]



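However, we find that

\[
\operatorname{diam}\left(\left\{ [e_i, b_1]^\top, [e_j, b_2]^\top \right\}\right) =
\begin{cases}
h(b_1, b_2) & \text{if } i = j,\\
2 + h(b_1, b_2) & \text{if } i \ne j,
\end{cases}
\]

where $h(b_1, b_2)$ denotes the Hamming distance between the strings $b_1, b_2 \in \{0,1\}^{\log_2 k}$. Hence, we may assume that Algorithm 3 starts by merging points $[e_i, 0, b']^\top$ and $[e_i, 1, b']^\top$ for all $1 \le i \le k$ and all $b' \in \{0,1\}^{\log_2(k) - 1}$, thereby forming $\frac{1}{2}k^2$ clusters of diameter 1.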

Next, we show inductively that Algorithm 3 keeps merging pairs of clusters that agree on the first $k$ coordinates until the algorithm halts. To this end, assume that there is some number $1 \le t < \log_2 k$ such that the clustering computed so far consists solely of the clusters

\[ C^{(t)}_{i,b'} = \left\{ [e_i, b, b']^\top \;\middle|\; b \in \{0,1\}^{t} \right\} \]

for all $1 \le i \le k$ and all $b' \in \{0,1\}^{\log_2(k) - t}$. Also note that this is the case with $t = 1$ after the first $\frac{1}{2}k^2$ merges. In such a case, we have

\[
\operatorname{diam}\left( C^{(t)}_{i,b_1} \cup C^{(t)}_{j,b_2} \right) =
\begin{cases}
t + h(b_1, b_2) & \text{if } i = j,\\
2 + t + h(b_1, b_2) & \text{if } i \ne j.
\end{cases}
\]

Hence, as above, we may assume that in the next $\frac{1}{2^{t+1}}k^2$ steps Algorithm 3 merges the clusters $C^{(t)}_{i,(0,b')}$ and $C^{(t)}_{i,(1,b')}$ for all $1 \le i \le k$ and all $b' \in \{0,1\}^{\log_2(k) - (t+1)}$. The resulting clusters are of diameter $t + 1$. Also, we have $C^{(t+1)}_{i,b'} = C^{(t)}_{i,(0,b')} \cup C^{(t)}_{i,(1,b')}$.

Algorithm 3 keeps merging clusters in this way until after $t = \log_2 k$ rounds we end up with the $k$-clustering $\mathcal{C}_k = \{ C_i \mid 1 \le i \le k \}$ where

\[ C_i = \left\{ [e_i, b]^\top \;\middle|\; b \in \{0,1\}^{\log_2 k} \right\}. \]

These clusters $C_i$ are of diameter $\log_2 k$. Comparing to (16), we deduce that Algorithm 3 computes a solution to Problem 3 with at least $\frac{1}{2}\log_2 k$ times the cost of an optimal solution. □
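The inductive argument can be checked on a small instance. The following Python sketch is an illustration under assumptions, not an implementation of Algorithm 3: it replays the merge order described in the proof (the function names `build_instance` and `adversarial_run` are hypothetical) and asserts before every merge that no other pair of current clusters could be merged more cheaply, so that the order is a valid outcome of complete linkage with adversarial tie-breaking. The value $k = 8$ is chosen only to keep the brute-force check fast.

```python
from itertools import combinations, product
from math import log2

def build_instance(k):
    """Points [e_i, b]^T as 0/1 tuples, for k a power of 2."""
    m = int(log2(k))
    pts = []
    for i in range(k):
        unit = tuple(int(j == i) for j in range(k))
        pts.extend(unit + b for b in product((0, 1), repeat=m))
    return pts

def l1(p, q):
    return sum(abs(a - c) for a, c in zip(p, q))

def diam(cluster):
    return max((l1(p, q) for p, q in combinations(cluster, 2)), default=0)

def adversarial_run(k):
    """Replay the merge order from the proof, checking each merge is a cheapest one."""
    m = int(log2(k))
    # Singletons keyed by (index i of the unit vector, bit string b).
    level = {(p[:k].index(1), p[k:]): frozenset([p]) for p in build_instance(k)}
    for t in range(m):
        pending, merged = dict(level), {}
        for i in range(k):
            for rest in product((0, 1), repeat=m - t - 1):
                a = pending.pop((i, (0,) + rest))
                b = pending.pop((i, (1,) + rest))
                current = [a, b, *pending.values(), *merged.values()]
                cheapest = min(diam(c | d) for c, d in combinations(current, 2))
                assert diam(a | b) == cheapest  # a legal complete-linkage step
                merged[(i, rest)] = a | b
        level = merged
    return list(level.values())

k = 8
alg_cost = max(diam(c) for c in adversarial_run(k))

# Reference k-clustering from the proof: group the points by their bit string b.
by_b = {}
for p in build_instance(k):
    by_b.setdefault(p[k:], set()).add(p)
ref_cost = max(diam(c) for c in by_b.values())
print(alg_cost, ref_cost)
```

Since `ref_cost` only upper-bounds the optimal cost, the quotient `alg_cost / ref_cost` is a lower bound on the approximation factor attained on this instance.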

Considering the diameter $k$-clustering problem with respect to an arbitrary $\ell_p$-metric (with $1 \le p < \infty$), note that the behavior of Algorithm 3 does not change if we consider the $p$-th power of the $\ell_p$-distance instead of the $\ell_p$-distance. Also note that for all $x, y \in \{0,1\}^d$ we have $\|x - y\|_p^p = \|x - y\|_1$. Since instance $X$ from Proposition 30 is a subset of $\{0,1\}^d$, we immediately obtain the following corollary.
Corollary 31. For the metric based on any $\ell_p$-norm with $1 \le p < \infty$, there exists an input set $X \subset \mathbb{R}^d$ with $d = k + \log_2 k$ such that Algorithm 3 computes a solution to Problem 3 with $\sqrt[p]{\frac{1}{2}\log_2 k}$ times the cost of an optimal solution.
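The effect of passing from the $\ell_1$-norm to an $\ell_p$-norm on a 0/1 instance can be written out as a short computation. The following is a sketch that only uses the identity stated above and the diameters from the proof of Proposition 30; the notation $\operatorname{diam}_p$ and $\mathrm{opt}_k^{(p)}$ for quantities measured in the $\ell_p$-metric is introduced here for illustration and does not appear in the original text.

```latex
\[
  \|x-y\|_p^p \;=\; \sum_{j=1}^{d} |x_j-y_j|^p \;=\; \sum_{j=1}^{d} |x_j-y_j| \;=\; \|x-y\|_1
  \qquad \text{for all } x,y\in\{0,1\}^d ,
\]
\[
  \operatorname{diam}_p(C_i) = (\log_2 k)^{1/p}, \qquad
  \mathrm{opt}_k^{(p)} \le 2^{1/p}, \qquad\text{hence}\qquad
  \frac{\operatorname{diam}_p(C_i)}{\mathrm{opt}_k^{(p)}}
  \;\ge\; \left(\frac{\log_2 k}{2}\right)^{1/p}
  \;=\; \sqrt[p]{\tfrac{1}{2}\log_2 k}.
\]
```

For $p = 1$ this recovers the factor $\frac{1}{2}\log_2 k$ of Proposition 30, and for $p = 2$ it gives a factor of order $\sqrt{\log k}$.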

Additionally, considering the diameter $k$-clustering problem with respect to the $\ell_\infty$-metric, it is known that every $n$-point subset of an arbitrary metric space can be embedded isometrically into $(\mathbb{R}^n, \ell_\infty)$ [5]. Hence, the instance from Proposition 30 of size $n = k^2$ yields an instance in $\mathbb{R}^{k^2}$ satisfying the same approximation bound with respect to the $\ell_\infty$-distance. We obtain the following corollary.

Corollary 32. For the metric based on the $\ell_\infty$-norm, there exists an input set $X \subset \mathbb{R}^d$ with $d = k^2$ such that Algorithm 3 computes a solution to Problem 3 with $\frac{1}{2}\log_2 k$ times the cost of an optimal solution.
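One standard way to realize such an isometric embedding is the Fréchet embedding, which maps each point to the vector of its distances to all $n$ points. Whether this is the construction used in the cited reference is an assumption of the sketch below; the function name `frechet_embedding` and the small example point set are likewise hypothetical and serve only as an illustration.

```python
from itertools import combinations

def frechet_embedding(points, metric):
    """Map each point p to the vector (metric(p, q))_{q in points} in R^n."""
    return [tuple(metric(p, q) for q in points) for p in points]

def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def linf(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

# Small illustrative point set (an assumption of this sketch, not the instance X).
points = [(1, 0, 0, 0), (1, 0, 0, 1), (0, 1, 1, 0), (0, 1, 1, 1)]
images = frechet_embedding(points, l1)

# The l_infinity distance of the images equals the l_1 distance of the originals:
# the triangle inequality gives "<=", and the coordinate belonging to q gives ">=".
for (p, fp), (q, fq) in combinations(zip(points, images), 2):
    assert linf(fp, fq) == l1(p, q)
```

Because all pairwise distances are preserved exactly, the diameter of every cluster and therefore every merge decision based on diameters is unchanged by the embedding.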

4.4.1 The discrete $k$-center problem

The input instance $X$ from Proposition 30 also proves lower bounds on the approximation factor of the agglomerative solution to the discrete $k$-center problem. To this end, just note that for the instance $X$ in every step of the algorithm the minimal discrete radius of a cluster equals the diameter of the cluster. We immediately obtain the following corollaries.

Corollary 33. For the metric based on any $\ell_p$-norm with $1 \le p < \infty$, there exists an input set $X \subset \mathbb{R}^d$ with $d = k + \log_2 k$ such that Algorithm 1 computes a solution to Problem 1 with $\sqrt[p]{\frac{1}{2}\log_2 k}$ times the cost of an optimal solution.

Corollary 34. For the metric based on the $\ell_\infty$-norm, there exists an input set $X \subset \mathbb{R}^d$ with $d = k^2$ such that Algorithm 1 computes a solution to Problem 1 with $\frac{1}{2}\log_2 k$ times the cost of an optimal solution.
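The observation that discrete radius and diameter coincide on the instance $X$ can be verified by brute force for a small $k$. The sketch below is a hedged illustration: it assumes that the discrete radius of a cluster is the smallest radius of a ball centered at an input point that covers the cluster, and it checks both variants (center restricted to the cluster, center anywhere in $X$) because the precise definition is not restated in this section. The names `unit` and `discrete_radius` are hypothetical, and the check is done with respect to the $\ell_1$-metric for $k = 8$.

```python
from itertools import combinations, product

k, m = 8, 3   # k is a power of 2 and m = log2(k)

def unit(i):
    """The i-th canonical unit vector of R^k as a tuple."""
    return tuple(int(j == i) for j in range(k))

def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def diam(cluster):
    return max((l1(p, q) for p, q in combinations(cluster, 2)), default=0)

def discrete_radius(cluster, centers):
    """Smallest radius of a ball around one of `centers` that covers `cluster`."""
    return min(max(l1(c, p) for p in cluster) for c in centers)

X = [unit(i) + b for i in range(k) for b in product((0, 1), repeat=m)]

checked = 0
for t in range(m + 1):
    for i in range(k):
        for rest in product((0, 1), repeat=m - t):
            # Cluster of round t: unit vector e_i, first t bits free, other bits fixed.
            cluster = [unit(i) + free + rest for free in product((0, 1), repeat=t)]
            assert discrete_radius(cluster, cluster) == diam(cluster) == t
            assert discrete_radius(cluster, X) == t
            checked += 1
print(checked)
```

The loop covers the singletons ($t = 0$) and every cluster formed in the rounds described in the proof of Proposition 30.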

Moreover, in case of the $\ell_2$-norm, we obtain the following corollary.

Corollary 35. For the metric based on the $\ell_2$-norm, there exists an input set $X \subset \mathbb{R}^d$ with $d = O(\log^3 k)$ such that Algorithm 1 computes a solution to Problem 1 with $\Omega(\sqrt{\log k})$ times the cost of an optimal solution.

Corollary 35 follows by embedding the instance from Corollary 33 into the $O(\log^3 k)$-dimensional Euclidean space without altering the behavior of the agglomerative algorithm or the lower bound of $\Omega(\sqrt{\log k})$ (Johnson-Lindenstrauss embedding [10]). For this embedded instance, the bound given in
Section 3.1 states an upper bound of $20d + 2\log(k) + 2 = O(\log^3 k)$ times the cost of an optimal solution. Hence, in case of the discrete $k$-center clustering using the $\ell_2$-metric, the upper bound from our analysis almost matches the lower bound.

Furthermore, this implies that the approximation factor of Algorithm 1 cannot be simultaneously independent of $d$ and $\log k$. More precisely, the approximation factor cannot be sublinear in $\sqrt[6]{d}$ and in

5 Open problems 

The main open problems our work raises are: 

• Can the doubly exponential dependence on d in Theorem [5l] be improved? 

• Are the different dependencies on $d$ in the approximation factors for the discrete $k$-center problem, the $k$-center problem, and the diameter $k$-clustering problem due to the limitations of our analysis or are they inherent to these problems?

• Can our results be extended to more general distance measures? 

• Can the lower bounds for $\ell_p$-metrics with $1 < p < \infty$ be improved to $\Omega(\log k)$, matching the lower bound from [1] for all $\ell_p$-norms?